Last edited by CTI TF 10 May 2022 18:50.
Once an HDD has been in use for more than 2 years, the probability of hardware failure increases. Among these failures, the "slow disk" is the most common.
After an HDD becomes a slow disk, IO requests either do not return or return with high latency. As a result, virtual machines stop responding or perform slowly, which affects customer production.
The cause is a hardware failure of the HDD, which results in IO requests that do not return or in high IO latency. From experience, new hard disks within their first 3 months of use and hard disks that have been in use for more than 2 years are more likely to be affected.
After an HDD becomes a slow disk, it stays online and read and write operations continue to be issued to it. Because the disk is slow, IO does not return or IO latency is relatively high, so the IO activity of the VM gets stuck or its response time increases.
Older versions of HCI do not support the slow disk isolation function, so when a physical hard disk becomes a slow disk, it cannot be isolated.
How to check
Any HDD can become a slow disk.
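As a rough illustration of what checking for a slow disk can look like on a generic Linux host, the sketch below compares two samples of /proc/diskstats and computes the average IO latency per disk over the interval. It is not the HCI detection mechanism, which this article does not describe. The function names, the 100 ms latency threshold, and the "stalled" rule (requests in flight but none completed) are assumptions made for illustration only.

```python
import time


def parse_diskstats(text):
    """Parse /proc/diskstats text into {device: (ios_completed, io_time_ms, in_flight)}."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or malformed lines
        name = fields[2]
        ios_completed = int(fields[3]) + int(fields[7])   # reads + writes completed
        io_time_ms = int(fields[6]) + int(fields[10])     # ms spent reading + writing
        in_flight = int(fields[11])                       # IOs currently in progress
        stats[name] = (ios_completed, io_time_ms, in_flight)
    return stats


def classify(before, after, latency_threshold_ms=100.0):
    """Label each disk as 'ok', 'slow', 'stalled', or 'idle' between two samples.

    latency_threshold_ms is a hypothetical cutoff, not a vendor value.
    """
    result = {}
    for name, (ios1, ms1, in_flight) in after.items():
        if name not in before:
            continue
        ios0, ms0, _ = before[name]
        delta_ios = ios1 - ios0
        delta_ms = ms1 - ms0
        if delta_ios == 0:
            # Nothing completed: requests still in flight suggests IO is not returning.
            result[name] = "stalled" if in_flight > 0 else "idle"
        elif delta_ms / delta_ios > latency_threshold_ms:
            result[name] = "slow"   # average latency per IO exceeds the threshold
        else:
            result[name] = "ok"
    return result


def sample(interval_s=5.0, path="/proc/diskstats"):
    """Take two samples interval_s apart and classify every disk."""
    with open(path) as fh:
        before = parse_diskstats(fh.read())
    time.sleep(interval_s)
    with open(path) as fh:
        after = parse_diskstats(fh.read())
    return classify(before, after)
```

Calling sample() on a Linux host returns a dictionary keyed by device name. A disk that stays "slow" or "stalled" across several intervals is a candidate for closer inspection; a single high reading can also come from a temporary load spike.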
It is recommended to upgrade HCI to version 6.3.0R1 or above, in which slow disk isolation is implemented. When HCI detects a slow disk, it automatically isolates the disk to avoid affecting production.