ozone
HDDS-10372. SCM and Datanode communication for reconciliation
What changes were proposed in this pull request?
A lot of boilerplate code to do something very simple:
- Tell SCM to start reconciliation for a container from the CLI.
- Have SCM tell Datanodes to reconcile that container with their peers.
- Datanodes send back a placeholder container data checksum which we can fill in with reconciliation implementation later.
- There is no communication between datanodes added in this change.
- SCM updates its replica info based on the container report received after the Datanodes reconcile.
I've tried to avoid making any design-related decisions in this PR. It is intended as a skeleton we can use to plug in the reconciliation implementation for end-to-end testing in future changes.
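The round trip described above can be pictured with a toy model. This is only an illustrative sketch, not code from this PR: `ToyScm`, `ToyDatanode`, `ReplicaReport`, and the `placeholder-<id>` checksum string are all hypothetical names, and the real change moves protobuf commands over heartbeats rather than calling methods directly.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of the reconcile round trip. Every name here is hypothetical.
final class ReconcileFlowSketch {

  /** Stand-in for the container report a datanode sends after reconciling. */
  static final class ReplicaReport {
    final String datanode;
    final long containerId;
    final String dataChecksum;

    ReplicaReport(String datanode, long containerId, String dataChecksum) {
      this.datanode = datanode;
      this.containerId = containerId;
      this.dataChecksum = dataChecksum;
    }
  }

  /** Stand-in for a datanode: queues reconcile commands, answers with filler data. */
  static final class ToyDatanode {
    private final String name;
    private final List<Long> commandQueue = new ArrayList<>();

    ToyDatanode(String name) {
      this.name = name;
    }

    void enqueueReconcile(long containerId) {
      commandQueue.add(containerId);
    }

    // Drains the queue. No peer communication happens; only a placeholder
    // checksum is produced, mirroring the scope of this change.
    List<ReplicaReport> processQueue() {
      List<ReplicaReport> reports = new ArrayList<>();
      for (long id : commandQueue) {
        reports.add(new ReplicaReport(name, id, "placeholder-" + id));
      }
      commandQueue.clear();
      return reports;
    }
  }

  /** Stand-in for SCM: fans the command out and records reported checksums in memory. */
  static final class ToyScm {
    private final List<ToyDatanode> replicas;
    private final Map<String, String> checksumByDatanode = new LinkedHashMap<>();

    ToyScm(List<ToyDatanode> replicas) {
      this.replicas = replicas;
    }

    void reconcile(long containerId) {
      for (ToyDatanode dn : replicas) {
        dn.enqueueReconcile(containerId);
      }
    }

    void handleReport(ReplicaReport report) {
      checksumByDatanode.put(report.datanode, report.dataChecksum);
    }

    Map<String, String> checksums() {
      return checksumByDatanode;
    }
  }

  /** Runs one CLI-triggered reconcile against three toy replicas. */
  static Map<String, String> runOnce(long containerId) {
    List<ToyDatanode> datanodes = List.of(
        new ToyDatanode("dn1"), new ToyDatanode("dn2"), new ToyDatanode("dn3"));
    ToyScm scm = new ToyScm(datanodes);
    scm.reconcile(containerId);
    for (ToyDatanode dn : datanodes) {
      for (ReplicaReport report : dn.processQueue()) {
        scm.handleReport(report);
      }
    }
    return scm.checksums();
  }

  public static void main(String[] args) {
    System.out.println(runOnce(7L));
  }
}
```

In the sketch, the only state SCM keeps is the per-datanode checksum map. That map stands in for the replica info that SCM updates from the container report.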
In scope for this change
- Add new `ozone admin container reconcile <container-id>` command.
- New command should be restricted to admins
- Audit logging for new command
- Blocking reconciliation of invalid containers (EC, 1 replica, still open)
- Datanode queue metrics for reconciliation commands
- Datanode and SCM application logs to follow the command as it moves through the system.
- SCM saves container replicas' data checksums in memory, and they can be retrieved with `ozone admin container info --json`
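As a rough illustration of the "blocking reconciliation of invalid containers" item, the three rejected cases (EC, 1 replica, still open) could be expressed as a single check. This is a minimal sketch under assumptions, not the PR's actual handler: the enums, the `rejectionReason` method, and the reason strings are hypothetical, and "1 replica" is interpreted here as "fewer than two known replicas".

```java
import java.util.Optional;

// Hypothetical eligibility check. The names below are illustrative and are
// not the real SCM classes or enums.
final class ReconcileEligibilitySketch {

  enum ReplicationType { RATIS, EC }

  enum LifeCycleState { OPEN, CLOSING, QUASI_CLOSED, CLOSED }

  /**
   * Returns a reason the reconcile request should be blocked,
   * or an empty Optional when the container looks eligible.
   */
  static Optional<String> rejectionReason(
      ReplicationType type, LifeCycleState state, int replicaCount) {
    if (type == ReplicationType.EC) {
      return Optional.of("EC containers cannot be reconciled");
    }
    if (state == LifeCycleState.OPEN) {
      return Optional.of("container is still open");
    }
    // Assumption: with fewer than two replicas there is no peer to reconcile with.
    if (replicaCount < 2) {
      return Optional.of("container has no peer replicas");
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    System.out.println(
        rejectionReason(ReplicationType.RATIS, LifeCycleState.CLOSED, 3).isEmpty());
    System.out.println(
        rejectionReason(ReplicationType.EC, LifeCycleState.CLOSED, 5).orElse("eligible"));
  }
}
```

Returning a reason instead of a boolean is a design choice made for this sketch. It would let the CLI and audit log say why a request was refused.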
Out of scope for this change (but will be handled in later tasks)
- Any actual checksum-related implementations
- Currently byte strings are used as placeholders just to move filler data around for testing.
- Recon integration with container data checksums
- This includes Recon's `ContainerReplicaHistoryProto`
- Finalized protobuf changes
- Since the change is going to a feature branch we have the flexibility to evolve the protos later.
- Good UX 😄
- This includes flags for the `reconcile` command, an easy way to track reconciliation progress, and reading containers from stdin like other `container` subcommands support.
- These will need some discussion so are probably best done as their own set of changes.
- The following tasks have been moved out to do in follow up changes:
- HDDS-10714 datanode status filtering for reconciliation peers and targets
- HDDS-10759 Consider allowing reconciliation when not all replicas have reached closed state
- HDDS-10760 SCMExceptions resulting from admin CLI commands are treated as retriable
What is the link to the Apache JIRA
HDDS-10372
How was this patch tested?
- Acceptance test for CLI added
- Manually tested the CLI with valid and invalid containers. Also manually checked SCM audit logging
- Unit and integration tests added in the following classes:
- `TestReconcileContainerEventHandler`: Tests SCM's filtering of reconciliation requests based on eligible container and replica states. When containers are eligible, tests that reconcile commands are sent to datanodes.
- `TestStateContext`: Tests that the new command shows up in datanode queue metrics.
- `TestReconcileContainerCommandHandler`: Tests datanode queue and runtime metrics when a reconcile command is received. Also tests that the ICR sent as the result of the command has the expected data checksum.
- `TestContainerDataYaml`: Tests that the data checksum is not written to the .container file. Merkle tree information will be written to its own file in a different change.
- `TestHeartbeatEndpointTask`: Tests that datanodes add a reconcile command to their queue when it is received on an SCM heartbeat response.
- `TestKeyValueHandler`: Tests that the `KeyValueHandler` triggers an ICR back to SCM with the expected values when reconciliation is invoked.
- `TestContainerReportHandler`, `TestIncrementalContainerReportHandler`: Tests that SCM correctly saves replicas' data checksum information it receives on a heartbeat.
At last the CI is green