HCI Host Encountered RAID Card Controller Error

Issue Description

An alert indicates that the host encountered a RAID card controller error.

Error/Warning Information


Handling Process

1. Check the RAID status from the system backend.
Command: raidstat

- The output contains no controller_model information.

2. Check the kernel log. No relevant information was found.

3. Run a cluster health check. It reports no RAID card error.

4. Mark the alert as fixed and remove the host from the unhealthy host list.
- After a period of monitoring, the alert reappears.

5. Check the RAID card status and model from IPMI.
- The card is an HP Smart Array P408i. According to the compatibility list, this model is supported after HCI version 5.8.5.

Complete RAID card model name: HPE Smart Array P408i-a SR Gen10
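
The check in step 1 can be scripted so that the same test is repeatable before and after a fix. The following is a minimal sketch only: the helper name has_controller_model is hypothetical, and the "controller_model: value" or "controller_model = value" layout is an assumption about the raidstat output, which this article does not show. Adjust the pattern to the actual output on the host.

```shell
#!/bin/sh
# Hypothetical helper: reads raidstat output on stdin and succeeds only if a
# controller_model field carries a non-empty value.
# ASSUMPTION: the field is printed as "controller_model: value" or
# "controller_model = value"; the real raidstat layout may differ.
has_controller_model() {
    grep -Eiq 'controller_model[[:space:]]*[:=][[:space:]]*[^[:space:]]'
}

# Example use on the host (commented out so sourcing this file runs nothing):
# if raidstat | has_controller_model; then
#     echo "controller model reported"
# else
#     echo "controller model missing"
# fi
```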

Root Cause

Solution

Apply the patch to resolve the above error.

1. Upload the patch to the /root directory.
2. Create a custom directory under /boot/firmware/current/.
i. cd to /boot/firmware/current.
ii. mkdir custom

3. Extract the patch into the /boot/firmware/current/custom directory.
Command: tar xf /root/01-sp-p408ia-p440ar-raidstat.tar.gz -C /boot/firmware/current/custom

4. Temporarily move the version file to the /home directory so that it is not copied in the next step.
Command: mv /boot/firmware/current/custom/01-sp-p408ia-p440ar-raidstat/version /home

5. Run the command below to synchronize the patch files to the root directory (/).
Command: rsync -av /boot/firmware/current/custom/01-sp-p408ia-p440ar-raidstat/* /

6. Move the version file back to the patch directory.
Command: mv /home/version /boot/firmware/current/custom/01-sp-p408ia-p440ar-raidstat/

7. Run raidstat again. It can now obtain the RAID card model information.
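
Steps 2 to 6 above can be gathered into one function, which makes it harder to forget to restore the version file. This is a sketch under stated assumptions, not an official installer: the function name apply_raidstat_patch and its four parameters are hypothetical, and mkdir -p is used instead of plain mkdir so that a rerun does not fail when the custom directory already exists. The paths in the comments are the ones given in this article.

```shell
#!/bin/sh
# Hypothetical helper wrapping patch steps 2-6. On-host values per the article:
#   $1 tarball  /root/01-sp-p408ia-p440ar-raidstat.tar.gz
#   $2 fw_dir   /boot/firmware/current
#   $3 dest     /
#   $4 holding  /home   (temporary location for the version file)
apply_raidstat_patch() {
    tarball=$1 fw_dir=$2 dest=$3 holding=$4
    patch_dir=$fw_dir/custom/01-sp-p408ia-p440ar-raidstat

    [ -f "$tarball" ] || { echo "patch not found: $tarball" >&2; return 1; }

    # Steps 2-3: create the custom directory and extract the patch into it.
    mkdir -p "$fw_dir/custom" || return 1
    tar xf "$tarball" -C "$fw_dir/custom" || return 1

    # Step 4: park the version file so rsync does not copy it to dest.
    mv "$patch_dir/version" "$holding/" || return 1

    # Step 5: synchronize the remaining patch files into dest.
    rc=0
    rsync -av "$patch_dir"/* "${dest%/}/" || rc=$?

    # Step 6: restore the version file even if rsync failed.
    mv "$holding/version" "$patch_dir/" || return 1
    return "$rc"
}

# Example invocation on the host (commented out so sourcing this file runs nothing):
# apply_raidstat_patch /root/01-sp-p408ia-p440ar-raidstat.tar.gz /boot/firmware/current / /home
```

The function returns the rsync exit status, so a non-zero result means the files may not have been fully synchronized even though the version file is back in place.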


If the scenario differs from the above or the solution does not work, consult a specialist or R&D for further verification.


Doc ID: 5719
Author: CTI Chris
Updated: 2022-01-28 08:53