【HCI Product Warning】VS slow disk that may affect production
  

CTI TF Lv2 Posted 10 May 2022 18:31

Last edited by CTI TF 10 May 2022 18:50.

Bug Phenomena
When an HDD has been in use for more than 2 years, the probability of hardware failure increases. The most common HDD failure is the "slow disk".
Once an HDD becomes a slow disk, IO requests either do not return or return with high latency. Virtual machines then stop responding or perform slowly, which affects customer production.


Trigger conditions
The HDD has a hardware failure that causes IO requests not to return or to return with high latency. In our experience, new hard disks in their first 3 months of use and hard disks that have been in use for more than 2 years are more likely to hit this problem.


Root Cause
After an HDD becomes a slow disk, it stays online and read and write operations continue to be sent to it. Because the disk is slow, those IOs do not return or return with relatively high latency, so the VM's IO activity gets stuck or has longer response times.
Older versions of HCI do not support the slow disk isolation function, so when a physical hard disk becomes a slow disk, it cannot be isolated.


How to check
Any HDD can potentially become a slow disk.
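
For anyone who wants a rough way to spot a high-latency disk on a generic Linux host, here is a minimal sketch. It is not the detection logic HCI uses. It assumes you can capture two snapshots of /proc/diskstats a few seconds apart, and the function names and the 100 ms threshold are made up for illustration.

```python
def parse_diskstats(text):
    """Parse /proc/diskstats-style text into {device: (ios_completed, ms_spent)}."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 11:
            continue  # skip blank or truncated lines
        name = fields[2]
        reads, ms_reading = int(fields[3]), int(fields[6])
        writes, ms_writing = int(fields[7]), int(fields[10])
        stats[name] = (reads + writes, ms_reading + ms_writing)
    return stats


def average_latency_ms(before, after):
    """Average time per completed IO (ms) between two parsed snapshots."""
    latencies = {}
    for name, (ios_after, ms_after) in after.items():
        if name not in before:
            continue
        ios_before, ms_before = before[name]
        completed = ios_after - ios_before
        if completed > 0:
            latencies[name] = (ms_after - ms_before) / completed
    return latencies


def find_slow_disks(before_text, after_text, threshold_ms=100.0):
    """Return device names whose average IO latency meets the threshold.

    threshold_ms is an illustrative value, not a vendor recommendation.
    """
    latencies = average_latency_ms(
        parse_diskstats(before_text), parse_diskstats(after_text)
    )
    return sorted(name for name, ms in latencies.items() if ms >= threshold_ms)
```

Note that a disk whose IO never returns completes no requests in the interval, so this sketch will not flag it. For that case you would also need to look at the in-flight IO count (the field right after the write time in /proc/diskstats).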


Solution
We recommend upgrading HCI to version 6.3.0R1 or above, which implements slow disk isolation. When HCI detects a slow disk, it automatically isolates the disk to avoid affecting production.
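
If you track several clusters and want to script the "6.3.0R1 and above" check, something like the sketch below might help. It assumes version strings look like "6.3.0R1" or "6.2.0", and that a version without an R suffix sorts before R1 of the same release. Both are assumptions on my part, and the function names are hypothetical, so confirm against the version shown on your own platform.

```python
import re

# Assumed version format: MAJOR.MINOR.PATCH with an optional R<number> suffix.
_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)(?:R(\d+))?")

# First version stated in this post to include slow disk isolation.
MIN_ISOLATION_VERSION = (6, 3, 0, 1)


def parse_hci_version(version):
    """Turn a string such as '6.3.0R1' into a comparable tuple (6, 3, 0, 1)."""
    match = _VERSION_RE.fullmatch(version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch, revision = match.groups()
    # Assumption: a missing R suffix is treated as revision 0.
    return (int(major), int(minor), int(patch), int(revision or 0))


def supports_slow_disk_isolation(version):
    """True if the version is 6.3.0R1 or above."""
    return parse_hci_version(version) >= MIN_ISOLATION_VERSION
```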
