redpanda icon indicating copy to clipboard operation
redpanda copied to clipboard

archival: return error when offset not found in segment

Open abhijat opened this issue 2 years ago • 1 comments

Cover letter

When scanning through a segment to find the given offset, it is possible the offset is not found in the segment. In this case the changes in this PR return an error and an upload candidate is not created.

Changes in this PR:

  1. Move segment offset search to storage
  2. Tests for offset search
  3. Return error when offset not in segment
  4. Slight change to returning max timestamp in corner case during end offset search

Fixes #6990

Backport Required

  • [ ] not a bug fix
  • [ ] issue does not exist in previous branches
  • [ ] papercut/not impactful enough to backport
  • [x] v22.3.x
  • [ ] v22.2.x
  • [ ] v22.1.x

UX changes

  • none

Release notes

  • none

abhijat avatar Nov 14 '22 15:11 abhijat

It looks like some existing unit tests like test_archival_policy_timeboxed_uploads started failing when we returned error if the offset is not in segment.

abhijat avatar Nov 16 '22 14:11 abhijat

In the compacted case, I don't have a good intuition for what's meant to happen when we try to look up an offset and don't find it: is there a place we're retrying later with a different offset?

For compacted uploads the compacted segment collector utility does its own set of preliminary checks where the manifest boundaries and gaps are examined before determining the start and end offset.

So we should not encounter a situation where the offsets we end up using as start and end are missing (especially as the offset search inside segment just does a less than or equal to comparison, not checking if we have the exact offset or not).

If we do run into that case because of truncation then the segment collector on next run fast forwards the begin offset to the start of the manifest and the search starts from there.

abhijat avatar Nov 25 '22 10:11 abhijat

I think we should make changes to return type in a different PR to reduce the change surface area but happy to take a stab at it now if you think otherwise.

That's fine with me.

jcsp avatar Nov 25 '22 10:11 jcsp

If we do run into that case because of truncation then the segment collector on next run fast forwards the begin offset to the start of the manifest and the search starts from there.

Cool, that was the piece I was wondering about.

jcsp avatar Nov 25 '22 10:11 jcsp

/backport v22.3.x

abhijat avatar Nov 25 '22 15:11 abhijat