arroyo --no-strict-offset-reset doesn't always work

Steps to Reproduce

Not sure, just adding a placeholder for further information.

Expected Result

--no-strict-offset-reset should work and let consumers reset their own offsets when their committed offsets are out of retention.

It might be the combination of earliest and no-strict-offset-reset; latest probably would have worked.

Actual Result

Not sure?

Oct 23 '24 17:10 mwarkentin

I think the combination of --auto-offset-reset=earliest and --no-strict-offset-reset sometimes leads to situations where arroyo resets to an offset that has already expired, so the reset itself fails:

%4|1729638367.626|OFFSET|rdkafka#consumer-2| [thrd:main]: snuba-spans [62]: offset reset (at offset 70037080713 (leader epoch 4), broker 0) to offset BEGINNING (leader epoch -1): fetch failed due to requested offset not available on the broker: Broker: Offset out of range
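
To make the suspected failure mode concrete, here is a toy model of one interleaving that would produce an "Offset out of range" error right after a reset to earliest. It is only a sketch under assumptions: `ToyPartition`, `OffsetOutOfRange` and `reset_to_earliest` are hypothetical names invented for illustration, and nothing here is arroyo's or librdkafka's actual code. The assumption is that "earliest" is resolved to a concrete offset first, and retention removes that offset before the fetch happens.

```python
from dataclasses import dataclass


class OffsetOutOfRange(Exception):
    """Raised by the toy broker when a fetch targets an expired offset."""


@dataclass
class ToyPartition:
    log_start: int  # oldest offset still retained
    log_end: int    # next offset to be written

    def earliest(self) -> int:
        return self.log_start

    def expire(self, count: int) -> None:
        # Retention deletes the oldest `count` records.
        self.log_start = min(self.log_start + count, self.log_end)

    def fetch(self, offset: int) -> int:
        if not (self.log_start <= offset <= self.log_end):
            raise OffsetOutOfRange(offset)
        return offset


def reset_to_earliest(partition: ToyPartition, expired_in_between: int) -> str:
    target = partition.earliest()         # step 1: resolve "earliest" to an offset
    partition.expire(expired_in_between)  # step 2: retention runs in the meantime
    try:
        partition.fetch(target)           # step 3: fetch from the now-stale target
    except OffsetOutOfRange:
        return "out_of_range"
    return "ok"
```

In this model the reset succeeds when nothing expires between steps 1 and 3, and fails as soon as at least one record does, which would fit a topic with high throughput and tight retention.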

Oct 23 '24 18:10 untitaker

@untitaker was that a one-time thing? Or does the consumer continually attempt to reset to offsets that are already out of bounds?

E.g., is this an issue only on very high-throughput topics, or on ones where the consumer takes a while to commit the first batch?

Oct 23 '24 18:10 mwarkentin

I think it's a race condition in arroyo that allows this to happen. We should probably support constructs like --auto-offset-reset=earliest+1h to self-serve what we already end up doing manually.
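
A minimal sketch of how a value like earliest+1h could be parsed, assuming the suffix means a safety margin added on top of the earliest position. The syntax does not exist today; `parse_offset_reset` and the supported unit letters (s, m, h, d) are hypothetical and only illustrate the proposal.

```python
import re
from datetime import timedelta
from typing import Optional, Tuple

# Hypothetical unit suffixes for the proposed syntax.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}
_PLAIN_VALUES = ("earliest", "latest", "error")


def parse_offset_reset(value: str) -> Tuple[str, Optional[timedelta]]:
    """Split an offset reset value into (policy, optional margin)."""
    if value in _PLAIN_VALUES:
        return value, None
    match = re.fullmatch(r"earliest\+(\d+)([smhd])", value)
    if match is None:
        raise ValueError(f"unsupported offset reset value: {value!r}")
    amount, unit = match.groups()
    return "earliest", timedelta(**{_UNITS[unit]: int(amount)})
```

With this, `parse_offset_reset("earliest+1h")` yields the policy `"earliest"` and a one-hour `timedelta`. How the margin is then turned into a concrete offset is left open here; a timestamp-based lookup on the consumer would be one option.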

Oct 23 '24 18:10 untitaker

We should also reconsider (per our discussions) whether --no-strict-offset-reset even needs to exist anymore, now that we primarily use --auto-offset-reset=earliest for all of our consumers.

Oct 23 '24 19:10 mwarkentin