[SCP] VM transmissions failed from DC site to DR site.


Issue Description

In some cases, users reported that VM transmissions were stuck on the SCP DC site, but no running or failed tasks were found for the respective VM in the HCI UI.

Error/Warning Information

Tranmission failed, Please check and fix the related issues before trying again.

Handling Process

1) Check the tasks of the respective VM to verify whether any are still running. (In this case, no running or failed tasks were found.)


2) Log in to the SCP backend and check the gecko-config logs. In this case, they contained errors related to backup transmissions.

Root Cause

The VM transmissions are suspected to have become stuck during DR policy execution.

Solution

1. On the backend of the DC HCI site, run vs_cluster_cmd e "ps auxf | grep UPID".
(This shows which processes are currently running on each node.)

2. Run ps auxf | grep rsync. (rsync is a utility commonly used to synchronize files and directories between different locations.)

In this example, the output shows a running rsync process with PID 17270.


3. The output above shows one rsync process, PID 17270, that is currently stuck. This process is suspected of causing the VM transmission to fail, so kill it by its PID:

kill -9 17270

4. Repeat the steps above on each node that has an rsync process running, and kill each of those processes.

5. After the related rsync processes are killed from the backend, the VM transmissions can be executed normally.
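
The per-node part of steps 2 to 4 can be sketched as two small shell helpers. This is only an illustrative sketch and not part of the vendor procedure. The function names list_rsync and stop_pid are hypothetical, and the sketch assumes a Linux node whose ps supports the -eo option. Unlike the kill -9 used above, stop_pid tries SIGTERM first and falls back to SIGKILL only if the process is still alive after a grace period. That ordering is a swapped-in choice, not something the original case describes.

```shell
# Hypothetical helpers; names and defaults are assumptions for illustration.

# List rsync processes with PID, elapsed time, and full command line.
# The elapsed time helps judge whether a process looks stuck.
list_rsync() {
    # The [r] bracket keeps grep from matching its own command line.
    ps -eo pid,etime,args | grep '[r]sync' || true
}

# Stop one process by PID: send SIGTERM, wait up to a grace period
# (default 5 seconds), then send SIGKILL if it is still alive.
stop_pid() {
    target_pid="$1"
    grace="${2:-5}"
    # If SIGTERM cannot be delivered, the process is already gone.
    kill -TERM "$target_pid" 2>/dev/null || return 0
    waited=0
    while [ "$waited" -lt "$grace" ]; do
        # kill -0 only checks whether the process still exists.
        kill -0 "$target_pid" 2>/dev/null || return 0
        sleep 1
        waited=$((waited + 1))
    done
    # Still alive after the grace period: force-kill it.
    kill -KILL "$target_pid" 2>/dev/null || true
}
```

On a node, list_rsync would be run first to review the candidates, and then stop_pid 17270 would be called for the PID identified in step 3. Review the list before stopping anything, since a healthy transmission also runs rsync.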

Suggestions

Refer to the handling method.

Doc ID: 8956
Author: Edward Ma
Updated: 2023-07-20 11:19
Version: