go-car

consider easier ways to let users use a CAR index directly

Open mvdan opened this issue 2 years ago • 2 comments

Right now, an easy way to transparently use the index is to go through the blockstore package. Ideally we don't want that to be the main option, though - many use cases don't need a blockstore and its extra abstraction layer.

It should be possible to efficiently use an index from a CAR file on disk via OpenReader. The best option right now is to go through Reader.IndexReader and index.ReadFrom. Unfortunately, it has some shortcomings:

  1. Always loads the entire index into memory. This is rather wasteful if one just wants to loop over all index entries once, for example. Plus, OpenReader already uses an mmap, which allows for fast sequential or random access.

  2. No ability to inspect information about the index. For example, how do I tell if a CAR file has a multihash sorted index? Right now the API only allows this by loading the entire index into memory.

  3. Has some footguns; for example, it's a bit too easy to call index.ReadFrom straight on Reader.IndexReader, forgetting that it may be nil.

I'm working around these in the indexer provider, but I'll probably backport some of it in the form of reusable APIs.

This issue would enable https://github.com/ipld/go-car/issues/222, I think. It's unclear to me if/how https://github.com/ipld/go-car/issues/95 is related.

cc @masih @willscott

mvdan, Sep 27 '21 15:09