bee
Add option to skip traversal with stewardship endpoint
Summary
It would be good if the /stewardship GET endpoint had an optional parameter that skips the traversal of the data, so that it would be possible to check the availability of a single chunk only.
More context in #3205
Motivation
I wanted to write a tool that checks whether the individual chunks of a data set are available on the network, and wanted to use the /stewardship GET endpoint for that. However, it turned out that the endpoint has additional logic in it: it recognizes root chunks, intermediate chunks and manifest root chunks, and then traverses all the chunks that belong to the data set. That way the checks can become very expensive, and they require additional logic on the user's side to differentiate between the different kinds of chunks.
Implementation
There could be an optional query parameter (e.g. traverse=false or skipTraversal or something like that) that, when specified, would skip the traversal logic and simply try to fetch the given chunk from the network.
I created an example implementation that does this in the https://github.com/ethersphere/bee/tree/feat/stewardship-skip-traversal branch, but I understand that it is not production quality, so I don't expect it to be merged.
Actually, there are 3 different use cases for the /stewardship API, both GET and PUT.
- Current operation which traverses an entire manifest if the reference "smells" like one, and also traverses all of the chunks of a non-manifest /bytes reference (BMT joiner). Really only useful for small manifests or files.
- An option that only does the full /bytes reference (BMT joiner), but does NOT traverse the manifest. Useful for clients that do their own explicit mantaray manifest processing. (https://github.com/ethersphere/mantaray-js)
- The option described above which only checks the exact specified chunk. Useful for clients that do their own BMT processing. (https://github.com/fairDataSociety/bmt-js)
Both @mfw78 and I are doing option 2 with our large manifests on the swarm.
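The three use cases could be modelled as a single mode setting rather than a boolean flag. The following is only a sketch: the Mode type, the mode names ("full", "bytes", "chunk") and ParseMode are all hypothetical and do not exist in bee. It only shows how the three behaviours differ in what they traverse.

```go
package main

import "fmt"

// Mode selects how much of the data behind a reference is checked.
// All names here are hypothetical and only illustrate the three use cases.
type Mode int

const (
	ModeFull  Mode = iota // 1: traverse manifests and file chunk trees (current behaviour)
	ModeBytes             // 2: traverse the chunk tree of a /bytes reference, but not a manifest
	ModeChunk             // 3: check only the exact chunk that was specified
)

// ParseMode maps a (hypothetical) parameter value to a Mode.
// An empty value keeps the current behaviour.
func ParseMode(s string) (Mode, error) {
	switch s {
	case "", "full":
		return ModeFull, nil
	case "bytes":
		return ModeBytes, nil
	case "chunk":
		return ModeChunk, nil
	}
	return ModeFull, fmt.Errorf("unknown mode %q", s)
}

// TraversesManifest reports whether manifest entries are followed.
func (m Mode) TraversesManifest() bool { return m == ModeFull }

// TraversesChunkTree reports whether the chunks below the reference are followed.
func (m Mode) TraversesChunkTree() bool { return m == ModeFull || m == ModeBytes }

func main() {
	for _, s := range []string{"full", "bytes", "chunk"} {
		m, _ := ParseMode(s)
		fmt.Printf("%-5s manifest=%v chunkTree=%v\n", s, m.TraversesManifest(), m.TraversesChunkTree())
	}
}
```

A single mode value avoids having two independent flags whose combinations would need to be validated (for example, traversing a manifest without traversing chunk trees makes little sense).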
Since it is actually possible to retrieve single chunks using the /stewardship endpoint, we will for now close this issue.
I disagree. If you hit the /stewardship endpoint with a chunk address that happens to be the root reference of a mantaray manifest, it will traverse and process the ENTIRE manifest. Unless I'm missing something in the API that constrains it to a single chunk?
I see the point now. No, we do not have a query for this yet. It should be trivial to add though.