azure-sdk-for-java
Storage Content Validation - Encoder Performance Improvements
Summary
This PR updates StructuredMessageEncoder.java to return a reactive stream (Flux<ByteBuffer>) of encoded chunks rather than building and returning a single contiguous byte[]. The wire format (headers, segment layout, endianness, CRC fields) remains unchanged. The primary benefits are lower peak memory usage, improved throughput for large payloads, and better downstream flow control.
Key Changes in StructuredMessageEncoder.java
1) Public API: byte[] → Flux<ByteBuffer>
- Old: public byte[] encode(ByteBuffer unencodedBuffer) generated and returned a full byte[].
- New: public Flux<ByteBuffer> encode(ByteBuffer unencodedBuffer)
- Encoded chunks are produced lazily (e.g., Flux.defer(...)) and can be processed incrementally or collected when a contiguous buffer is required.
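For callers that still need a contiguous buffer, the emitted chunks can be gathered first (with Reactor, for instance, via collectList()) and then copied into one array. The following is a minimal sketch of that copy step only; ChunkCollector and toByteArray are hypothetical names used for illustration and are not part of this PR.

```java
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical helper, not part of the PR: joins already-collected chunks.
public final class ChunkCollector {
    private ChunkCollector() { }

    /** Copies the remaining bytes of each chunk, in order, into one array. */
    public static byte[] toByteArray(List<ByteBuffer> chunks) {
        int total = 0;
        for (ByteBuffer chunk : chunks) {
            // addExact throws ArithmeticException if the total exceeds int range.
            total = Math.addExact(total, chunk.remaining());
        }
        byte[] out = new byte[total];
        int offset = 0;
        for (ByteBuffer chunk : chunks) {
            // Read through a duplicate so the caller's position is left untouched.
            ByteBuffer view = chunk.duplicate();
            int length = view.remaining();
            view.get(out, offset, length);
            offset += length;
        }
        return out;
    }
}
```

Collecting this way reintroduces the single large allocation, so it is best reserved for callers that genuinely need a contiguous byte[].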
2) Reactive Error Signaling
- Validation errors (e.g., idempotency violations, content-length bounds) now propagate via terminal stream errors (Flux.error(...)) instead of synchronous exceptions, aligning with reactive consumption patterns.
3) Emission Path
- The encoder preserves the existing incremental layout (header → per-segment header/content/footer → footer) while emitting those parts directly as ByteBuffer items, avoiding a ByteArrayOutputStream and the final monolithic array allocation.
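The idea of emitting segment content as separate parts, without first copying everything into one array, can be sketched with plain java.nio. This sketch uses a java.util.Iterator as a stand-in for the Flux and covers only the content parts; the actual encoder also emits the headers and footers around them. SegmentSlicer and its segmentSize parameter are hypothetical names, not taken from the PR.

```java
import java.nio.ByteBuffer;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch: lazily yields zero-copy views of each segment's content.
public final class SegmentSlicer implements Iterator<ByteBuffer> {
    private final ByteBuffer source;
    private final int segmentSize;

    public SegmentSlicer(ByteBuffer content, int segmentSize) {
        if (segmentSize <= 0) {
            throw new IllegalArgumentException("segmentSize must be positive");
        }
        // A duplicate shares the bytes but has its own position and limit.
        this.source = content.duplicate();
        this.segmentSize = segmentSize;
    }

    @Override
    public boolean hasNext() {
        return source.hasRemaining();
    }

    @Override
    public ByteBuffer next() {
        if (!source.hasRemaining()) {
            throw new NoSuchElementException();
        }
        int length = Math.min(segmentSize, source.remaining());
        ByteBuffer view = source.duplicate();
        view.limit(view.position() + length);
        // Advance past this segment so the next call starts at the following one.
        source.position(source.position() + length);
        return view.slice();
    }
}
```

Because each part is a view over the caller's buffer, no segment content is copied until a consumer chooses to copy it.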
4) Wire-Format Consistency
- Endianness: Numeric fields (segment number as short, sizes and CRCs as long) remain LITTLE_ENDIAN.
- CRC64: When StructuredMessageFlags.STORAGE_CRC64 is set, segment footers and the message footer include CRC64 long values; otherwise, CRC fields are omitted as before.
- Layout constants: Existing constants (e.g., V1_HEADER_LENGTH, V1_SEGMENT_HEADER_LENGTH, CRC64_LENGTH) and segment sizing logic are retained to ensure identical binary output.
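As an illustration of the little-endian fields described above, here is a minimal sketch that writes a segment header and an optional CRC64 footer with java.nio. The field order (segment number first, then content length) and the class and method names are assumptions made for this example; the encoder's own constants and layout logic remain the source of truth.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch of little-endian field encoding; not the PR's code.
public final class LittleEndianFields {
    private LittleEndianFields() { }

    /** Assumed layout: 2-byte segment number, then 8-byte content length. */
    public static ByteBuffer segmentHeader(short segmentNumber, long contentLength) {
        ByteBuffer buffer = ByteBuffer.allocate(Short.BYTES + Long.BYTES)
            .order(ByteOrder.LITTLE_ENDIAN);
        buffer.putShort(segmentNumber);
        buffer.putLong(contentLength);
        buffer.flip(); // switch from writing to reading
        return buffer;
    }

    /** Holds an 8-byte CRC64 only when the CRC flag is set; otherwise empty. */
    public static ByteBuffer segmentFooter(boolean crc64Enabled, long crc64) {
        if (!crc64Enabled) {
            return ByteBuffer.allocate(0);
        }
        ByteBuffer buffer = ByteBuffer.allocate(Long.BYTES)
            .order(ByteOrder.LITTLE_ENDIAN);
        buffer.putLong(crc64);
        buffer.flip();
        return buffer;
    }
}
```

With these assumptions, segment number 1 with content length 5 is written as the bytes 01 00 05 00 00 00 00 00 00 00, least significant byte first.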
Motivation
- Lower Peak Memory: Streaming chunks avoids allocating a large contiguous byte[] for big payloads.
- Throughput & Backpressure: Downstream consumers can start processing as data is produced, which reduces end-to-end latency and memory pressure.
Tests
- The tests in MessageEncoderTests.java are updated to collect the Flux<ByteBuffer> and assert the same structural, CRC, and error-case behaviors as before.
API Change Check
APIView identified API level changes in this PR and created the following API reviews