warchaeology

feat: WARC record iterator

Open maeb opened this issue 10 months ago • 1 comment

The gowarc.WarcFileReader does not provide a simple way to get the size of the next record.

This PR introduces an iterator abstraction for WARC records. In addition to returning the size of each record, the iterator encapsulates:

  • filtering records
  • limiting the number of records
  • selecting the nth record
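To illustrate the three behaviours listed above, here is a minimal sketch of such an iterator. It is not the code in this PR and does not use the gowarc API: `Record`, `Options`, `Iterator`, `NewIterator`, `sliceSource` and `collect` are all hypothetical names, and the record size is assumed to be supplied by the underlying source rather than computed from file offsets.

```go
package main

import (
	"errors"
	"io"
)

// Record is a hypothetical stand-in for a parsed WARC record.
type Record struct {
	Type string // e.g. "warcinfo", "request", "response"
	Size int64  // bytes the record occupies in the file (assumed known)
}

// Options configures the iterator. Zero values disable each feature.
type Options struct {
	Filter func(Record) bool // nil keeps every record
	Limit  int               // maximum number of records to yield; 0 means no limit
	Nth    int               // yield only the nth matching record (1-based); 0 disables
}

// Iterator wraps a record source and applies filtering, limiting and nth selection.
type Iterator struct {
	next    func() (Record, error) // returns io.EOF when the source is exhausted
	opts    Options
	matched int // records that passed the filter so far
	yielded int // records returned to the caller so far
}

// NewIterator builds an Iterator over the given source function.
func NewIterator(next func() (Record, error), opts Options) *Iterator {
	return &Iterator{next: next, opts: opts}
}

// Next returns the next record that satisfies the options, or io.EOF when done.
func (it *Iterator) Next() (Record, error) {
	for {
		// Stop once the limit is reached or the nth record has been yielded.
		if it.opts.Limit > 0 && it.yielded >= it.opts.Limit {
			return Record{}, io.EOF
		}
		if it.opts.Nth > 0 && it.yielded >= 1 {
			return Record{}, io.EOF
		}
		rec, err := it.next()
		if err != nil {
			return Record{}, err
		}
		if it.opts.Filter != nil && !it.opts.Filter(rec) {
			continue
		}
		it.matched++
		if it.opts.Nth > 0 && it.matched != it.opts.Nth {
			continue
		}
		it.yielded++
		return rec, nil
	}
}

// sliceSource is a test helper that serves records from an in-memory slice.
func sliceSource(recs []Record) func() (Record, error) {
	i := 0
	return func() (Record, error) {
		if i >= len(recs) {
			return Record{}, io.EOF
		}
		rec := recs[i]
		i++
		return rec, nil
	}
}

// collect drains the iterator into a slice, treating io.EOF as normal termination.
func collect(it *Iterator) ([]Record, error) {
	var out []Record
	for {
		rec, err := it.Next()
		if errors.Is(err, io.EOF) {
			return out, nil
		}
		if err != nil {
			return out, err
		}
		out = append(out, rec)
	}
}
```

In this sketch the filter is applied before nth selection, so `Options{Filter: f, Nth: 2}` means "the second record that passes `f`". Whether the PR uses the same ordering is not stated here.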

~This PR only includes the implementation of the iterator. Future pull requests will start using it.~ edit: added use cases

maeb • Apr 10 '24 14:04