The value of the kafka_stream_block_cache_size_bytes metric is different from the value of the s3.block.cache.size setting
We configured s3.block.cache.size=10737418240 (10GB) in server.properties, but the value of the kafka_stream_block_cache_size_bytes metric reported by the S3 metrics exporter differs from the value we set for s3.block.cache.size. The size shown by the metric is less than 1GB and it keeps changing. Why is this the case?
- The s3.block.cache.size is the maximum size of data that BlockCache can cache.
- BlockCache only caches data that is unread or read-ahead; data that has already been read is dropped as no longer useful (see the sketch below).
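To illustrate that distinction, here is a minimal sketch (not AutoMQ's actual code; the class and method names are invented): the configured value is a capacity cap, while the exported gauge reports the bytes currently held, which shrinks whenever read data is dropped.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: s3.block.cache.size is a hard upper bound, while a
// gauge like kafka_stream_block_cache_size_bytes would report the current
// occupancy, which fluctuates as blocks are read ahead and then dropped.
public class BlockCacheSketch {
    private final long maxSizeBytes;                 // s3.block.cache.size
    private final AtomicLong occupiedBytes = new AtomicLong();

    public BlockCacheSketch(long maxSizeBytes) {
        this.maxSizeBytes = maxSizeBytes;
    }

    // Called when a block is read ahead from S3 into the cache.
    public boolean tryCache(long blockSize) {
        long newSize = occupiedBytes.addAndGet(blockSize);
        if (newSize > maxSizeBytes) {
            occupiedBytes.addAndGet(-blockSize);     // over the cap: reject
            return false;
        }
        return true;
    }

    // Called once a consumer has read the block: the data is no longer
    // useful, so it is dropped and the occupancy gauge decreases.
    public void dropAfterRead(long blockSize) {
        occupiedBytes.addAndGet(-blockSize);
    }

    // What the occupancy metric would report: usually well below the cap
    // when consumers keep up with the read-ahead.
    public long currentSizeBytes() {
        return occupiedBytes.get();
    }
}
```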
@superhx Thanks for your reply. If it works as you describe, then this problem is even stranger. The value of the kafka_stream_block_cache_size_bytes metric has never exceeded the 10GB we set, yet WARN messages like the following keep appearing in the log:
[2025-07-11 01:28:46,776] WARN [SUPPRESSED_TIME=27] The unread block is evicted, please increase the block cache size (com.automq.stream.s3.cache.blockcache.StreamReader)
[2025-07-11 01:26:15,596] WARN [SUPPRESSED_TIME=28] The unread block is evicted, please increase the block cache size (com.automq.stream.s3.cache.blockcache.StreamReader)
[2025-07-11 01:04:45,683] WARN [SUPPRESSED_TIME=28] The unread block is evicted, please increase the block cache size (com.automq.stream.s3.cache.blockcache.StreamReader)
@jerome-j-20230331 A cached DataBlock is evicted after (createTimestamp + 1min). So the warning may be caused by the consumer reading too slowly to keep up with the data that is pre-read from S3.
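A minimal sketch of the TTL behavior described above (the method names and the exact check are assumptions for illustration, not the actual DataBlockCache code; only the one-minute TTL comes from the comment above):

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of TTL-based eviction: a cached block is dropped
// once one minute has passed since it was created, even if it has not
// been read yet. Names and structure are illustrative only.
public class DataBlockTtlSketch {
    private static final long TTL_MS = TimeUnit.MINUTES.toMillis(1);

    static class DataBlock {
        final long createTimestamp = System.currentTimeMillis();
        boolean read = false;
    }

    static boolean isExpired(DataBlock block, long nowMs) {
        return nowMs >= block.createTimestamp + TTL_MS;
    }

    static void maybeEvict(DataBlock block) {
        if (isExpired(block, System.currentTimeMillis())) {
            if (!block.read) {
                // A slow consumer never got to this pre-read block, so the
                // eviction surfaces as the "unread block is evicted" warning
                // even though the cache is nowhere near its size limit.
                System.out.println("WARN The unread block is evicted, ...");
            }
            // The block is dropped regardless of its read state.
        }
    }
}
```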
The current log message is a little misleading. The BlockCache should emit a different log for each DataBlock eviction cause:
- The cache size isn't enough
- The DataBlock's TTL is reached
https://github.com/AutoMQ/automq/blob/0c1a1964194ee42aca8ac4890b617dae55027af1/s3stream/src/main/java/com/automq/stream/s3/cache/blockcache/DataBlockCache.java#L236-L258
You are welcome to submit a PR to fix it. A rough sketch of what such a change might look like (the enum and method are invented here; the real fix belongs in the DataBlockCache linked above):
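```java
// Hypothetical sketch of logging a distinct message per eviction cause.
// EvictReason and logEviction are invented names for illustration.
enum EvictReason { CACHE_SIZE_EXCEEDED, TTL_REACHED }

class EvictionLoggerSketch {
    void logEviction(EvictReason reason, boolean unread) {
        if (!unread) {
            return; // evicting already-read data is expected, no warning
        }
        switch (reason) {
            case CACHE_SIZE_EXCEEDED:
                System.out.println("WARN The unread block is evicted because "
                        + "the cache is full, please increase the block cache size");
                break;
            case TTL_REACHED:
                System.out.println("WARN The unread block expired (TTL reached) "
                        + "before it was read, the consumer may be lagging");
                break;
        }
    }
}
```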
@superhx Thank you very much for your answer. We are indeed facing a large consumption-lag problem and have been trying to tune the block cache size to address it. Because of that log output, we always assumed the lag was caused by insufficient memory. However, I am just an operations engineer and not skilled at Java programming, so I am sorry that I can't contribute this fix to the community.
Consumer lag should be addressed on the consumer side. You can refer to this blog to see the BlockCache's performance.
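For example, on the consumer side you could raise fetch throughput with standard Kafka client settings. The values below are illustrative starting points only, not recommendations for this specific workload, and should be sized against the actual message rate and record size:

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Illustrative consumer-side tuning so the consumer can keep up with the
// data that the broker pre-reads from S3. Values are examples only.
public class ConsumerTuningSketch {
    public static KafkaConsumer<byte[], byte[]> build(String bootstrap, String group) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrap);
        props.put(ConsumerConfig.GROUP_ID_CONFIG, group);
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        // Pull more data per poll so pre-read blocks are consumed in time.
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 2000);
        props.put(ConsumerConfig.FETCH_MAX_BYTES_CONFIG, 64 * 1024 * 1024);
        props.put(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, 8 * 1024 * 1024);
        return new KafkaConsumer<>(props);
    }
}
```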