AKIP-1: Object storage failover
Status
Under Discussion
Motivation
Object storage services can suffer regional-level outages. This proposal adds object storage disaster recovery, via a failover bucket, so that the Kafka cluster's writes and partition migrations remain available during such an outage.
Public Interfaces
Config
- Kafka config changes:
  s3.endpoint=main-endpoint,1@xxx-endpoint
  s3.region=main-region,1@cn-xxx
  s3.bucket=main-bucket,1@failover-xxx-bucket
- Environment config changes:
  KAFKA_S3_ACCESS_KEY=main-ak,1@xxx
  KAFKA_S3_SECRET_KEY=main-sk,1@xxx

The `1@` prefix denotes the index of the failover bucket; the main bucket has index 0 by default and needs no prefix.
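For illustration, here is a minimal sketch of how a multi-bucket value in this `index@value` syntax could be parsed; `BucketConfigParser` is a hypothetical helper, not AutoMQ's actual config code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical parser for values like "main-bucket,1@failover-xxx-bucket".
// An entry without an "N@" prefix is the main bucket, index 0.
public final class BucketConfigParser {
    public static Map<Integer, String> parse(String value) {
        Map<Integer, String> buckets = new HashMap<>();
        for (String part : value.split(",")) {
            int at = part.indexOf('@');
            if (at < 0) {
                buckets.put(0, part.trim()); // main bucket defaults to index 0
            } else {
                int index = Integer.parseInt(part.substring(0, at).trim());
                buckets.put(index, part.substring(at + 1).trim());
            }
        }
        return buckets;
    }
}
```

Parsing `"main-bucket,1@failover-xxx-bucket"` would yield `{0=main-bucket, 1=failover-xxx-bucket}`, and the same scheme applies to the endpoint, region, and credential values.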
Proposed Changes
Brief
Add a failover object storage bucket as a backup. The failover bucket can live in a different object storage cluster or in a nearby region. If the main bucket becomes unavailable for writes, the system automatically writes to the failover bucket so that the object storage write service stays available.
Detail
Metadata
The S3Object metadata adds a bucket field that records which bucket the object is stored in.
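Conceptually, the change amounts to something like the sketch below; only the new `bucket` field comes from this proposal, while the other fields and names are assumptions for illustration.

```java
// Illustrative shape of the extended object metadata. Only the new
// "bucket" field is taken from this proposal; the rest is assumed.
public record S3ObjectMetadata(long objectId, long objectSize, short bucket) {
    public boolean onFailoverBucket() {
        return bucket != 0; // by convention, index 0 is the main bucket
    }
}
```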
Read
DefaultS3BlockCache#readFromS3 reads data from object storage and needs to propagate the bucket identifier to DefaultS3Operator through the read context. DefaultS3Operator then selects the corresponding bucket for the read based on that identifier.
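A sketch of how the bucket identifier might flow through the read path; `ReadContext`, `ObjectStorage`, and `MultiBucketS3Operator` are illustrative stand-ins for the actual interfaces.

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Hypothetical read context carrying the bucket index recorded in the
// object's metadata, so the operator knows which bucket to read from.
record ReadContext(long objectId, short bucket, long start, long end) {}

interface ObjectStorage {
    CompletableFuture<ByteBuffer> rangeRead(String key, long start, long end);
}

final class MultiBucketS3Operator {
    private final Map<Short, ObjectStorage> clients; // bucket index -> client

    MultiBucketS3Operator(Map<Short, ObjectStorage> clients) {
        this.clients = clients;
    }

    // Pick the client for the bucket named in the context, then read.
    CompletableFuture<ByteBuffer> read(ReadContext ctx, String key) {
        return clients.get(ctx.bucket()).rangeRead(key, ctx.start(), ctx.end());
    }
}
```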
Write
Add an object storage availability detection mechanism, or a manual switch, to mark whether the main bucket is writable.
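One possible shape for that mark, combining a probe-driven flag with a manual override; the proposal leaves the detection mechanism open, so everything here is an assumption.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical availability mark for the main bucket: flipped either by
// a periodic health probe or by an operator via a manual switch.
final class MainBucketAvailability {
    private final AtomicBoolean probeHealthy = new AtomicBoolean(true);
    private final AtomicBoolean manualFailover = new AtomicBoolean(false);

    boolean mainBucketWritable() {
        return probeHealthy.get() && !manualFailover.get();
    }

    void onProbeResult(boolean success) { probeHealthy.set(success); }

    void setManualFailover(boolean on) { manualFailover.set(on); }
}
```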
The ObjectWriter creates a context that tracks the bucket it writes to. (Why not implement fault tolerance and retries inside ObjectWriter automatically? Once a part upload completes, the corresponding ByteBuf is released; if object storage becomes unavailable after that point, the part can no longer be retried.)
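The sketch below illustrates that constraint: the bucket is fixed when the writer is created, and each part's buffer is released as soon as the part is uploaded, so a later bucket switch has nothing left to replay. Class and method names are illustrative, not the actual ObjectWriter API.

```java
import io.netty.buffer.ByteBuf;

// Illustrative writer: the target bucket is captured once, in the
// writer's context at creation time, and never changes mid-upload.
final class BucketPinnedObjectWriter {
    private final short bucket; // chosen when the writer is created

    BucketPinnedObjectWriter(short bucket) {
        this.bucket = bucket;
    }

    void writePart(ByteBuf part) {
        try {
            uploadPartTo(bucket, part); // hypothetical UploadPart call
        } finally {
            // The buffer is released as soon as the part is uploaded; if
            // object storage fails afterwards, the bytes are gone, so the
            // upload cannot be retried against another bucket. Failover
            // therefore restarts the whole object upload instead.
            part.release();
        }
    }

    private void uploadPartTo(short bucket, ByteBuf part) {
        // issue an UploadPart request against the given bucket (omitted)
    }
}
```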
In non-disaster-recovery mode, DeltaWALUploadTask selects the main bucket as the upload bucket. If the upload to the main bucket fails and the broker enters disaster-recovery mode while retrying, the failover bucket becomes the upload bucket and the object is re-uploaded from scratch.
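A minimal sketch of that retry-time switch; `UploadBucketSelector` and the `BooleanSupplier` wiring are assumptions about how the mode flag would be consulted.

```java
import java.util.function.BooleanSupplier;

// Hypothetical bucket selection for DeltaWALUploadTask retries: stay on
// the main bucket normally, switch to the failover bucket once the
// broker has entered disaster-recovery mode.
final class UploadBucketSelector {
    static final short MAIN_BUCKET = 0;
    static final short FAILOVER_BUCKET = 1;

    private final BooleanSupplier mainBucketWritable; // e.g. the availability mark above

    UploadBucketSelector(BooleanSupplier mainBucketWritable) {
        this.mainBucketWritable = mainBucketWritable;
    }

    // Called when a failed upload is retried: switching buckets means the
    // object is re-uploaded from scratch, not resumed part by part.
    short selectUploadBucket() {
        return mainBucketWritable.getAsBoolean() ? MAIN_BUCKET : FAILOVER_BUCKET;
    }
}
```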
Others
Compaction task: run only in non-disaster-recovery mode, to avoid excessive cross-region reads and writes.
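The guard could be as simple as the following sketch; where the mode flag lives is an assumption.

```java
// Hypothetical gate around the compaction task: skip it while in
// disaster-recovery mode, since compaction would read objects from one
// region and write the compacted result into another.
final class CompactionGate {
    void maybeRunCompaction(boolean disasterRecoveryMode, Runnable compactionTask) {
        if (disasterRecoveryMode) {
            return; // avoid cross-region read/write amplification
        }
        compactionTask.run();
    }
}
```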
Task list
- https://github.com/AutoMQ/automq-for-kafka/issues/478
- https://github.com/AutoMQ/automq-for-kafka/issues/479
- https://github.com/AutoMQ/automq-for-kafka/issues/480
- https://github.com/AutoMQ/automq-for-kafka/issues/481
- https://github.com/AutoMQ/automq-for-kafka/issues/482
- https://github.com/AutoMQ/automq-for-kafka/issues/483