helia
Multiple/alternative data retrieval implementations
Currently Helia takes a blockstore that it enhances with bitswap. This creates a hard dependency on bitswap.
To enable experimentation with, and adoption of, faster or more use-case-specific retrieval protocols (CAR files, GraphSync, XYZNewFutureProtocol, etc.) the retrieval method should be a configuration option.
At this point blocks may not be the correct abstraction since it limits us to a block as the unit of data you get in response to a CID.
A better read abstraction might be a CID to a stream of Uint8Arrays? The underlying retrieval method can then apply whatever optimisations it can to fetch the data quickly, and the calling code doesn't have to keep going back to fetch another block for another CID.
interface Options {
  offset?: number
  length?: number
}

interface ContentReader {
  get (cid: CID, options: Options): AsyncGenerator<Uint8Array>
}
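To make the shape of the interface above concrete, here is a minimal sketch of one possible implementation. Everything beyond `Options` and `ContentReader` is an assumption for illustration: `MemoryContentReader` and `collect` are hypothetical names, `CID` is a plain-string stand-in for the real CID class so the snippet has no dependencies, and `offset`/`length` are interpreted as a byte range over the concatenated content, which the proposal does not specify.

```typescript
// Stand-in for the real CID class so the sketch has no dependencies (assumption).
type CID = string

interface Options {
  offset?: number
  length?: number
}

interface ContentReader {
  get (cid: CID, options: Options): AsyncGenerator<Uint8Array>
}

// Hypothetical reader backed by an in-memory map of CID -> ordered chunks.
// A real implementation would fetch via bitswap, a CAR file, GraphSync, etc.
class MemoryContentReader implements ContentReader {
  private readonly content: Map<CID, Uint8Array[]>

  constructor (content: Map<CID, Uint8Array[]>) {
    this.content = content
  }

  async * get (cid: CID, options: Options = {}): AsyncGenerator<Uint8Array> {
    const chunks = this.content.get(cid)

    if (chunks == null) {
      throw new Error(`No content for ${cid}`)
    }

    // Assumed semantics: skip `offset` bytes, then yield at most `length` bytes.
    let skip = options.offset ?? 0
    let remaining = options.length ?? Infinity

    for (const chunk of chunks) {
      if (remaining <= 0) {
        break
      }

      if (skip >= chunk.byteLength) {
        // The whole chunk lies before the requested offset.
        skip -= chunk.byteLength
        continue
      }

      const end = Math.min(chunk.byteLength, skip + remaining)
      const slice = chunk.subarray(skip, end)
      skip = 0
      remaining -= slice.byteLength

      yield slice
    }
  }
}

// Helper that drains a stream into a flat array of byte values.
async function collect (source: AsyncIterable<Uint8Array>): Promise<number[]> {
  const out: number[] = []

  for await (const chunk of source) {
    out.push(...Array.from(chunk))
  }

  return out
}
```

The caller only ever iterates the returned stream, e.g. `for await (const buf of reader.get(cid, { offset: 2, length: 3 }))`, so how the bytes were located and in what block sizes stays an implementation detail of the reader.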
Questions:
- Does this shift complexity of interpreting block data on to the content reader?
- What does the writer interface look like?
- Can the writer/reader interfaces be asymmetric? E.g. CIDs/Blocks in, CID/Stream out?
- Does this assume file data?
- What about structures like unixfs where the root block has file metadata and then file data in leaf nodes?
- If DAGs are all dag-pb, dag-cbor or dag-json we can make some assumptions about structure?
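
As a rough way to explore the questions above, the sketch below shows an asymmetric store (blocks in, stream out) where the reader interprets block data through a per-codec link extractor. This is only an illustration under stated assumptions: `DagReader`, `LinkExtractor`, `Block` and `dagJsonLinks` are hypothetical names, `CID` is again a plain-string stand-in, and only raw and a simplified form of dag-json (links encoded as `{"/": "<cid>"}`) are handled. dag-pb and dag-cbor would need real decoders.

```typescript
// Stand-in for the real CID class so the sketch has no dependencies (assumption).
type CID = string

// Multicodec codes for the codecs discussed in this issue.
const codecs = {
  raw: 0x55,
  dagPb: 0x70,
  dagCbor: 0x71,
  dagJson: 0x0129
}

interface Block {
  codec: number
  bytes: Uint8Array
}

// Hypothetical: given a block's bytes, return the CIDs it links to, in order.
type LinkExtractor = (bytes: Uint8Array) => CID[]

// Simplified dag-json handling: walk the parsed JSON and collect {"/": "<cid>"} links.
function dagJsonLinks (bytes: Uint8Array): CID[] {
  const links: CID[] = []

  const visit = (value: unknown): void => {
    if (Array.isArray(value)) {
      value.forEach(visit)
    } else if (value != null && typeof value === 'object') {
      const obj = value as Record<string, unknown>
      const link = obj['/']

      if (typeof link === 'string') {
        links.push(link)
      } else {
        Object.values(obj).forEach(visit)
      }
    }
  }

  visit(JSON.parse(new TextDecoder().decode(bytes)))

  return links
}

class DagReader {
  private readonly blocks = new Map<CID, Block>()
  private readonly extractors = new Map<number, LinkExtractor>([
    [codecs.raw, () => []],
    [codecs.dagJson, dagJsonLinks]
  ])

  // Writer side: CIDs and blocks in.
  put (cid: CID, block: Block): void {
    this.blocks.set(cid, block)
  }

  // Reader side: CID in, stream out. Walks the DAG depth-first and
  // yields the bytes of raw leaves in link order.
  async * get (cid: CID): AsyncGenerator<Uint8Array> {
    const block = this.blocks.get(cid)

    if (block == null) {
      throw new Error(`Missing block ${cid}`)
    }

    const extract = this.extractors.get(block.codec)

    if (extract == null) {
      throw new Error(`No link extractor for codec 0x${block.codec.toString(16)}`)
    }

    if (block.codec === codecs.raw) {
      yield block.bytes
      return
    }

    for (const link of extract(block.bytes)) {
      yield * this.get(link)
    }
  }
}
```

Even in this toy form the reader has to know which blocks are "data" and which are "structure", which suggests the answer to the first question is yes: the complexity of interpreting block data does move into the content reader, and a unixfs-style root block with metadata would need its own handling rather than a generic link walk.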