snuba icon indicating copy to clipboard operation
snuba copied to clipboard

snuba-subscription-consumer-events are running but are not consuming

Open welliamcao opened this issue 7 months ago • 7 comments

Environment

prod & test

What version are you running? Etc. Sentry 25.1.0

Actual Result

snuba-subscription-consumer-events container output is normal

root@localhost:/opt/apps/self-hosted# docker logs -f  --tail=10 sentry-self-hosted-snuba-subscription-consumer-events-1
2025-05-29 01:32:30,645 <snuba.subscriptions.scheduler_processing_strategy.TickBuffer object at 0x7f2aba5f7690> exited successfully, releasing assignment.
2025-05-29 01:32:30,646 Partition revocation complete.
2025-05-29 01:32:30,654 Stopped
2025-05-29 01:32:32,021 Initializing Snuba...
2025-05-29 01:32:38,311 Snuba initialization took 6.291913863271475s
2025-05-29 01:32:39,224 Initializing Snuba...
2025-05-29 01:32:45,444 Snuba initialization took 6.221630237996578s
{"module": "builtins", "event": "Checking Clickhouse connections", "severity": "info", "timestamp": "2025-05-29T01:32:45.471465Z"}
2025-05-29 01:32:48,499 New partitions assigned: {Partition(topic=Topic(name='snuba-commit-log'), index=0): 1, Partition(topic=Topic(name='snuba-commit-log'), index=1): 0}
2025-05-29 01:32:48,500 Member id: 'rdkafka-a09487b8-df02-42f0-84b0-4dc8fc592b50'

The kafka-consumer-groups command execution result shows that the snuba-events-subscriptions-consumers consumer is not consuming normally and the lag value is empty.

[appuser@287916e674b4 ~]$ kafka-consumer-groups --bootstrap-server localhost:9092 --group snuba-events-subscriptions-consumers -describe

GROUP                                TOPIC            PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG             CONSUMER-ID                                  HOST            CLIENT-ID
snuba-events-subscriptions-consumers snuba-commit-log 0          -               2               -               rdkafka-a09487b8-df02-42f0-84b0-4dc8fc592b50 /172.31.0.50    rdkafka
snuba-events-subscriptions-consumers snuba-commit-log 1          -               0               -               rdkafka-a09487b8-df02-42f0-84b0-4dc8fc592b50 /172.31.0.50    rdkafka

Expected Result

  1. snuba-events-subscriptions-consumers working normal
  2. alerts working normal (I guess it's because snuba-events-subscriptions-consumers is not working.)
  3. how can I debug this problem?

welliamcao avatar May 29 '25 02:05 welliamcao

Environment

prod & test

What version are you running? Etc. Sentry 25.1.0

Actual Result

snuba-subscription-consumer-events container output is normal

root@localhost:/opt/apps/self-hosted# docker logs -f  --tail=10 sentry-self-hosted-snuba-subscription-consumer-events-1
2025-05-29 01:32:30,645 <snuba.subscriptions.scheduler_processing_strategy.TickBuffer object at 0x7f2aba5f7690> exited successfully, releasing assignment.
2025-05-29 01:32:30,646 Partition revocation complete.
2025-05-29 01:32:30,654 Stopped
2025-05-29 01:32:32,021 Initializing Snuba...
2025-05-29 01:32:38,311 Snuba initialization took 6.291913863271475s
2025-05-29 01:32:39,224 Initializing Snuba...
2025-05-29 01:32:45,444 Snuba initialization took 6.221630237996578s
{"module": "builtins", "event": "Checking Clickhouse connections", "severity": "info", "timestamp": "2025-05-29T01:32:45.471465Z"}
2025-05-29 01:32:48,499 New partitions assigned: {Partition(topic=Topic(name='snuba-commit-log'), index=0): 1, Partition(topic=Topic(name='snuba-commit-log'), index=1): 0}
2025-05-29 01:32:48,500 Member id: 'rdkafka-a09487b8-df02-42f0-84b0-4dc8fc592b50'

The kafka-consumer-groups command execution result shows that the snuba-events-subscriptions-consumers consumer is not consuming normally and the lag value is empty.

[appuser@287916e674b4 ~]$ kafka-consumer-groups --bootstrap-server localhost:9092 --group snuba-events-subscriptions-consumers -describe

GROUP                                TOPIC            PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG             CONSUMER-ID                                  HOST            CLIENT-ID
snuba-events-subscriptions-consumers snuba-commit-log 0          -               2               -               rdkafka-a09487b8-df02-42f0-84b0-4dc8fc592b50 /172.31.0.50    rdkafka
snuba-events-subscriptions-consumers snuba-commit-log 1          -               0               -               rdkafka-a09487b8-df02-42f0-84b0-4dc8fc592b50 /172.31.0.50    rdkafka

Expected Result

  1. snuba-events-subscriptions-consumers working normal
  2. alerts working normal (I guess it's because snuba-events-subscriptions-consumers is not working.)
  3. how can I debug this problem?

i try use kafka-console-consumer is work, read topic message success.

[appuser@287916e674b4 ~]$ kafka-topics --bootstrap-server localhost:9092 --topic snuba-commit-log --describe
Topic: snuba-commit-log	TopicId: Mq6X0Yh7QrKY3a29zECFkg	PartitionCount: 1	ReplicationFactor: 1	Configs: max.message.bytes=50000000
	Topic: snuba-commit-log	Partition: 0	Leader: 1001	Replicas: 1001	Isr: 1001
[appuser@287916e674b4 ~]$ kafka-consumer-groups --bootstrap-server localhost:9092 --group snuba-events-subscriptions-consumers -describe

GROUP                                TOPIC            PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG             CONSUMER-ID                                  HOST            CLIENT-ID
snuba-events-subscriptions-consumers snuba-commit-log 0          1               2               1               rdkafka-274414e6-ca01-4055-9c90-0366e01cf7a2 /172.31.0.50    rdkafka
[appuser@287916e674b4 ~]$ kafka-console-consumer --bootstrap-server localhost:9092 --topic snuba-commit-log --from-beginning
{"offset": 50, "orig_message_ts": 1748487680.038, "received_p99": null}
{"offset": 51, "orig_message_ts": 1748487909.961, "received_p99": null}
{"offset": 52, "orig_message_ts": 1748488319.386, "received_p99": null}

Run output content in debug mode

root@localhost:/opt/apps/self-hosted# docker exec -it sentry-self-hosted-snuba-subscription-consumer-events-1 bash
snuba@361ec265b42f:/usr/src/snuba$ snuba subscriptions-scheduler-executor --dataset events --entity events --auto-offset-reset=latest --no-strict-offset-reset --consumer-group=snuba-events-subscriptions-consumers --followed-consumer-group=snuba-consumers --schedule-ttl=60 --stale-threshold-seconds=900  --log-level debug
2025-05-29 03:59:38,742 Initializing Snuba...
2025-05-29 03:59:45,002 Snuba initialization took 6.260880544781685s
{"module": "builtins", "event": "Checking Clickhouse connections", "severity": "info", "timestamp": "2025-05-29T03:59:45.027375Z"}
2025-05-29 03:59:45,032 Connecting. Database: default. User: default
2025-05-29 03:59:45,032 Connecting to 10.23.14.207:9000
2025-05-29 03:59:45,033 Connected to ClickHouse server version 23.8.11, revision: 54465
2025-05-29 03:59:45,034 Query: SELECT version()
2025-05-29 03:59:45,034 Block "" send time: 0.000038
2025-05-29 03:59:45,040 Starting
2025-05-29 03:59:45,951 New partitions assigned: {Partition(topic=Topic(name='snuba-commit-log'), index=0): 0}
2025-05-29 03:59:45,951 Member id: 'rdkafka-8d628e0a-89c1-4685-9129-8eb065d6eea6'
2025-05-29 03:59:45,951 Initialized processing strategy: <snuba.subscriptions.scheduler_processing_strategy.TickBuffer object at 0x7f7a742795d0>
2025-05-29 03:59:46,955 Waited 0.0000 seconds for offsets to be committed to <snuba.subscriptions.scheduler_consumer.CommitLogTickConsumer object at 0x7f7a86004750>.
2025-05-29 03:59:47,955 Waited 0.0000 seconds for offsets to be committed to <snuba.subscriptions.scheduler_consumer.CommitLogTickConsumer object at 0x7f7a86004750>.
2025-05-29 03:59:48,956 Waited 0.0000 seconds for offsets to be committed to <snuba.subscriptions.scheduler_consumer.CommitLogTickConsumer object at 0x7f7a86004750>.
2025-05-29 03:59:49,956 Waited 0.0000 seconds for offsets to be committed to <snuba.subscriptions.scheduler_consumer.CommitLogTickConsumer object at 0x7f7a86004750>.
2025-05-29 03:59:50,956 Waited 0.0000 seconds for offsets to be committed to <snuba.subscriptions.scheduler_consumer.CommitLogTickConsumer object at 0x7f7a86004750>.
2025-05-29 03:59:51,957 Waited 0.0000 seconds for offsets to be committed to <snuba.subscriptions.scheduler_consumer.CommitLogTickConsumer object at 0x7f7a86004750>.
2025-05-29 03:59:52,957 Waited 0.0000 seconds for offsets to be committed to <snuba.subscriptions.scheduler_consumer.CommitLogTickConsumer object at 0x7f7a86004750>.
2025-05-29 03:59:53,957 Waited 0.0000 seconds for offsets to be committed to <snuba.subscriptions.scheduler_consumer.CommitLogTickConsumer object at 0x7f7a86004750>.

welliamcao avatar May 29 '25 03:05 welliamcao

why always is “No alerts triggered during this time”

Image Image

welliamcao avatar May 29 '25 10:05 welliamcao

Same issue here using snuba 25.4.0

christiangonre avatar Jun 02 '25 14:06 christiangonre

Hi, ran into this as well on 25.5.1. I tried to reset everything by turning off all the commit log consumers, nuking all the groups and turning the consumers back on. They connected but didn't set any offsets as they normally do.

But there's no Kafka lag in my monitoring anymore, so it doesn't bother me that much.

Logs are the same as in the comment above, just Waited xyz seconds....

jiriks74 avatar Sep 04 '25 08:09 jiriks74

Unsurprisingly the issue appeared again.

jiriks74 avatar Sep 09 '25 11:09 jiriks74

I seem to have fixed this by running the consumer with scheduler and executor split. The lag is gone and now I see messages like

2025-09-09 12:00:21,545 Waited 0.0026 seconds for offsets to be committed to <snuba.subscriptions.scheduler_consumer.CommitLogTickConsumer object at 0x782e882de590>.
2025-09-09 12:00:22,294 Committing offset - no subscriptions: {Partition(topic=Topic(name='snuba-transactions-commit-log'), index=0): 1488196}

The offset is advancing though the consumers don't seem to be doing anything at least not yet. I'll see whether this changes after the consumer goes through the whole lag.

jiriks74 avatar Sep 09 '25 12:09 jiriks74

I just tested this in my staging environment and when the scheduler and executor are separate they start working (once I figured out how to generate the subscription events myself).

Committing offset: {Partition(topic=Topic(name='snuba-transactions-commit-log'), index=0): 14930}

jiriks74 avatar Sep 09 '25 13:09 jiriks74