Allowing empty segments with no offset advancing
This is a follow-up to https://github.com/apache/pinot/issues/8929, but for the case where 0 data is consumed. We've since found a poor interaction between Pinot and our S3 lifecycling.
- we have a few partitions that have 0 data all the time (not filtered out, literally 0 events)
- Pinot keeps that consuming segment open indefinitely (in this case we noticed it had been open for 1 year)
- Pinot also keeps the last completed segment for each partition (in this case, it's from 2023)
- We wanted to use S3 lifecycling to delete all data older than 30 days (our tables had 10-day retention) rather than rely only on Pinot's mechanisms, but this sent segments into an error state since they had been around for 1 year
Is there any reason we can't seal a segment where the offset hasn't advanced? In this case, we would have had N segments for this partition all with 0 records and the same start/end offset.
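To make the question concrete, here is a minimal sketch of a seal decision that does not require the offset to advance. Every name in it (`EmptySegmentSealSketch`, `shouldSeal`, `nextStartOffset`, and all parameters) is hypothetical and is not Pinot's actual API; it only assumes that a consuming segment has a start offset, a current offset, and a creation time.

```java
// Hypothetical sketch, not Pinot's real completion logic.
final class EmptySegmentSealSketch {

    /**
     * Decide whether a consuming segment should be sealed.
     * The time limit applies even when currentOffset == startOffset,
     * i.e. when the segment consumed nothing at all.
     */
    static boolean shouldSeal(long startOffset, long currentOffset,
                              long createdAtMs, long nowMs,
                              long maxAgeMs, long maxOffsetsPerSegment) {
        boolean sizeLimitHit = (currentOffset - startOffset) >= maxOffsetsPerSegment;
        boolean timeLimitHit = (nowMs - createdAtMs) >= maxAgeMs;
        return sizeLimitHit || timeLimitHit;
    }

    /**
     * The next consuming segment starts where the sealed one ended.
     * For an empty segment this equals its own start offset, so
     * consecutive empty segments share the same start/end offset.
     */
    static long nextStartOffset(long sealedEndOffset) {
        return sealedEndOffset;
    }
}
```

Under this sketch, a partition with no traffic would produce a new 0-record segment each time the age limit is reached, all with identical start and end offsets, which is the situation described above.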
cc @Jackie-Jiang @priyen-stripe
I think we don't seal them right now because we didn't support empty segments before. Now that we can support empty segments, we should be able to seal them. We'll want to revisit the timestamp used for empty segments (using the current time should work) so that the retention manager can remove them properly.
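A rough sketch of the timestamp idea, assuming retention is evaluated against a per-segment end time in milliseconds. The class and method names are hypothetical and do not reflect how Pinot's retention manager is actually implemented.

```java
// Hypothetical sketch, not Pinot's real retention manager.
final class EmptySegmentRetentionSketch {

    /**
     * An empty segment has no event timestamps to draw an end time from,
     * so fall back to the wall-clock time at which it was sealed.
     */
    static long effectiveEndTimeMs(long numDocs, long maxEventTimeMs, long sealTimeMs) {
        return numDocs == 0 ? sealTimeMs : maxEventTimeMs;
    }

    /** A segment is expired once its end time is older than the retention window. */
    static boolean isExpired(long endTimeMs, long nowMs, long retentionMs) {
        return (nowMs - endTimeMs) > retentionMs;
    }
}
```

With a 10-day retention, an empty segment sealed today would become eligible for removal after 10 days, just like a segment holding data, so no segment would linger long enough to be hit by a 30-day S3 lifecycle rule.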