claircore icon indicating copy to clipboard operation
claircore copied to clipboard

Seekable tarfs

Open hdonnay opened this issue 3 years ago • 10 comments

This adds support for variants of layers that don't need a whole layer decompressed.

TODO:

  • [x] Back out 1.19-new API
  • [x] Add ReaderAt HTTP implementation
  • [x] Back out httpreader buffering

hdonnay avatar Aug 22 '22 16:08 hdonnay

cc @giuseppe This zstd:chunked implementation is largely based on reverse-engineering the code in containers/storage. Is there more formal documentation or test fixtures that I can cross-reference against?

hdonnay avatar Aug 22 '22 17:08 hdonnay

Note to self: using linux's non-POSIX record locks and a file-reopening scheme would allow the fs.SubFS interface to be implemented directly, but would require some thought about how that interacts with a needed Close call, which doesn't exist on the returned fs.FS.

hdonnay avatar Aug 22 '22 19:08 hdonnay

cc @giuseppe This zstd:chunked implementation is largely based on reverse-engineering the code in containers/storage. Is there more formal documentation or test fixtures that I can cross-reference against?

@hdonnay, the zstd:chunked is left undocumented on purpose at the moment so that I can still break the format if needed :-)

Some high-level information is present in the original PR that added support for it: https://github.com/containers/storage/pull/775

Out of curiosity: how are you going to use these formats from Claire?

giuseppe avatar Aug 24 '22 11:08 giuseppe

ah, okay. Well I feel like I've got a handle on it, so I can help with the documentation when the time comes.

Clair can use this to reduce disk utilization at the cost of latency. This would be helpful for large layers where contents are full of content that our indexers don't care about.

hdonnay avatar Aug 24 '22 14:08 hdonnay

Decided to punt any httpreader usage to a future PR.

The tricky things for that future PR will be:

  • New fetcher implementation
  • New Indexer interfaces (either on the indexers themselves or on the Layer, somehow) to support using a single backing fs.FS
  • Fixing current users of (*Layer).Reader and (*Layer).SetLocal

hdonnay avatar Feb 06 '23 22:02 hdonnay

This is also going to be easier to review commit-wise, sorry.

hdonnay avatar Feb 06 '23 22:02 hdonnay

Codecov Report

Attention: 273 lines in your changes are missing coverage. Please review.

Files Coverage Δ
layer.go 70.54% <100.00%> (+2.27%) :arrow_up:
pkg/tarfs/tarfs.go 86.66% <100.00%> (+12.85%) :arrow_up:
pkg/tarfs/metrics.go 84.61% <84.61%> (ø)
pkg/tarfs/pool.go 84.00% <84.00%> (ø)
pkg/tarfs/file.go 81.08% <83.33%> (-7.50%) :arrow_down:
pkg/tarfs/randomaccess.go 58.82% <58.82%> (ø)
pkg/tarfs/fs.go 68.09% <68.09%> (ø)
pkg/tarfs/parse.go 53.25% <57.79%> (-0.95%) :arrow_down:
pkg/tarfs/srv.go 74.80% <74.80%> (ø)

... and 3 files with indirect coverage changes

:loudspeaker: Thoughts on this report? Let us know!.

codecov[bot] avatar Sep 07 '23 22:09 codecov[bot]

Decided to punt any httpreader usage to a future PR.

The tricky things for that future PR will be:

* New fetcher implementation

* New Indexer interfaces (either on the indexers themselves or on the Layer, somehow) to support using a single backing `fs.FS`

* Fixing current users of `(*Layer).Reader` and `(*Layer).SetLocal`

The implementation in #1061 addresses most of these structural problems.

hdonnay avatar Oct 09 '23 22:10 hdonnay

Going to re-draft this until after #1061 and the subsequent Scanner touching to be able to avoid all of the call-side rewrites that are currently in this PR.

hdonnay avatar Oct 09 '23 22:10 hdonnay

#1061 and #1105 addressed the ergonomics of this API change, so I think this is good to get reviewed.

hdonnay avatar Oct 26 '23 19:10 hdonnay