
[BUG] The S3 object count metric keeps increasing, with IllegalStateException in the log, when there are no producers or consumers

Open keashem opened this issue 10 months ago • 1 comments

Version & Environment

Based on AutoMQ 1.2.2. 3 brokers were running on 3 separate nodes. The S3 WAL and S3 storage both used CubeFS.

cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

What went wrong?

The S3 object count metric keeps increasing even though no producers or consumers are running, and the broker log shows repeated IllegalStateException errors.

Image

[2025-02-18 21:09:18,270] ERROR open test-topic-40 failed, retry open after 1s (kafka.log.streamaspect.ElasticUnifiedLog$)
java.lang.IllegalStateException
    at kafka.log.streamaspect.ElasticLogLoader.deleteSegmentsIfLogStartGreaterThanLogEnd$1(ElasticLogLoader.scala:129)
    at kafka.log.streamaspect.ElasticLogLoader.recoverLog(ElasticLogLoader.scala:182)
    at kafka.log.streamaspect.ElasticLogLoader.load(ElasticLogLoader.scala:69)
    at kafka.log.streamaspect.ElasticLog$.apply(ElasticLog.scala:710)

[2025-02-18 21:09:18,270] ERROR [ElasticLog partition=test-topic-17 epoch=79] failed to open elastic log, trying to close streams. (kafka.log.streamaspect.ElasticLog$)
java.lang.IllegalStateException
    at kafka.log.streamaspect.ElasticLogLoader.deleteSegmentsIfLogStartGreaterThanLogEnd$1(ElasticLogLoader.scala:129)
    at kafka.log.streamaspect.ElasticLogLoader.recoverLog(ElasticLogLoader.scala:182)
    at kafka.log.streamaspect.ElasticLogLoader.load(ElasticLogLoader.scala:69)
    at kafka.log.streamaspect.ElasticLog$.apply(ElasticLog.scala:710)
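
To check whether the same partitions are failing over and over, the ERROR lines can be tallied per partition. Below is a minimal Python sketch. It assumes the broker log contains exactly the two message formats quoted above. `count_failed_opens` is a hypothetical helper name, not part of AutoMQ.

```python
import re
from collections import Counter

# Patterns for the two ERROR messages quoted in this report.
# Assumption: the message text is stable; it is taken from the excerpts above only.
RETRY_RE = re.compile(r"ERROR open (\S+) failed, retry open")
OPEN_FAIL_RE = re.compile(
    r"ERROR \[ElasticLog partition=(\S+) epoch=\d+\] failed to open elastic log"
)


def count_failed_opens(log_text: str) -> Counter:
    """Count open-failure messages per partition name (hypothetical helper)."""
    counts = Counter()
    for pattern in (RETRY_RE, OPEN_FAIL_RE):
        for match in pattern.finditer(log_text):
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Sample input copied from the two log lines in this report.
    sample = (
        "[2025-02-18 21:09:18,270] ERROR open test-topic-40 failed, "
        "retry open after 1s (kafka.log.streamaspect.ElasticUnifiedLog$)\n"
        "[2025-02-18 21:09:18,270] ERROR [ElasticLog partition=test-topic-17 "
        "epoch=79] failed to open elastic log, trying to close streams. "
        "(kafka.log.streamaspect.ElasticLog$)\n"
    )
    for partition, n in count_failed_opens(sample).most_common():
        print(partition, n)
```

On a real log, pass the file contents to `count_failed_opens`. A small set of partitions with steadily growing counts would suggest a retry loop, but that alone does not establish a link to the object count metric.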

What should have happened instead?

The S3 object count metric should not increase when there are no producers or consumers.

How to reproduce the issue?

Additional information

Please attach any relevant logs, backtraces, or metric charts.

keashem, Feb 21 '25 02:02

Hi @keashem, thanks for your feedback. We'll look into this issue soon. Feel free to submit a pull request to fix it as well. Managing an AutoMQ cluster can be challenging, so if you need any support from our team, we're happy to help. We can set up a WeChat group or Slack channel for you.

daniel-y, Feb 26 '25 03:02