ozone HDDS-10136. Recon displaying DELETED container as missing.

Open ArafatKhan2198 opened this issue 8 months ago • 1 comments

What changes were proposed in this pull request?

Root Cause

The root cause lies in how container state transitions are handled. While a container is in the DELETING state, Recon may mark it as MISSING because no healthy replicas are reported for it. Recon checks container state only periodically, so until the next sync with SCM the transition to DELETED may not yet be reflected in Recon.

Known Behavior

This is a known behavior and is generally not a cause for concern. The discrepancy is temporary: a container can briefly appear as MISSING in Recon although it is already DELETED in SCM, because of the inherent delay in state synchronization between the two. It resolves itself once the ContainerHealthTask in Recon synchronizes the container state with SCM.

Solution

The solution involves modifying the ContainerHealthTask in Recon to handle these states correctly:

Skip DELETED Containers: When Recon finds a container that would be marked as MISSING, it now checks whether the container is actually in the DELETED state in SCM. If so, it skips any further processing for that container.
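
The check described above can be sketched roughly as follows. This is a minimal illustration only, not the actual patch: `MissingContainerFilter`, `ScmStateLookup`, and the simplified `LifeCycleState` enum are hypothetical stand-ins rather than real Ozone classes, and the sketch assumes SCM can be asked for the current state of a container by ID.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the relevant part of Recon's health check;
// none of these types are the real Ozone classes.
final class MissingContainerFilter {

  /** Simplified subset of container lifecycle states (assumption). */
  enum LifeCycleState { OPEN, CLOSED, DELETING, DELETED }

  /** Hypothetical lookup of the authoritative container state held by SCM. */
  interface ScmStateLookup {
    LifeCycleState getState(long containerId);
  }

  /**
   * Takes containers that look MISSING to Recon (no healthy replicas)
   * and drops those that SCM already reports as DELETED.
   */
  static List<Long> filterMissing(List<Long> candidates, ScmStateLookup scm) {
    List<Long> missing = new ArrayList<>();
    for (long id : candidates) {
      if (scm.getState(id) == LifeCycleState.DELETED) {
        continue; // deleted in SCM: skip further processing
      }
      missing.add(id);
    }
    return missing;
  }

  public static void main(String[] args) {
    // SCM's view of three containers that Recon sees without healthy replicas.
    Map<Long, LifeCycleState> scmView = Map.of(
        1L, LifeCycleState.CLOSED,
        2L, LifeCycleState.DELETED,
        3L, LifeCycleState.DELETING);
    List<Long> result = filterMissing(List.of(1L, 2L, 3L), id -> scmView.get(id));
    System.out.println(result); // prints [1, 3]
  }
}
```

In this sketch only a container that SCM reports as DELETED is skipped; a container still in DELETING continues to be reported until SCM completes the transition.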

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-10136

How was this patch tested?

Unit tests.

Jun 24 '24 17:06 ArafatKhan2198
