/stewardship checks don't check all chunks
Summary
I have seen several cases where /stewardship reports that a reference is retrievable, yet a /bytes request for that same reference on that same node fails. I finally tracked down the cause. The /stewardship API uses a traverser that creates a joiner with a net-only getter, but that joiner's processChunkAddresses doesn't actually GET all of the chunks: every chunk is reported to the callback, but a short-circuit check with a continue at the line below prevents leaf chunks from ever being fetched from the network.
https://github.com/ethersphere/bee/blob/c752928aad48da75bc0200088f455c1cb4b31b0b/pkg/file/joiner/joiner.go#L260
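To make the short circuit concrete, here is a minimal sketch of that loop. The types and names below are hypothetical stand-ins (the real code works on swarm.Address values through the getter the traverser supplies), so this illustrates the control flow rather than the actual bee source:

```go
// Hypothetical, simplified model of the joiner traversal around joiner.go#L260.
package joinersketch

import "context"

type Address string

// Getter stands in for the net-only getter that /stewardship hands to the joiner.
type Getter interface {
	Get(ctx context.Context, addr Address) ([]byte, error)
}

const chunkSize = 4096

type ref struct {
	addr        Address
	subtrieSize int64 // size of the subtree rooted at this reference
}

// processChunkAddresses (simplified): every reference is handed to fn, so the
// stewardship check believes the chunk was visited, but the continue skips the
// Get for leaf subtries, so leaf chunks are never requested from the network.
func processChunkAddresses(ctx context.Context, g Getter, fn func(Address) error, refs []ref) error {
	for _, r := range refs {
		if err := fn(r.addr); err != nil { // reported as "checked"
			return err
		}
		if r.subtrieSize <= chunkSize {
			continue // leaf chunk: short-circuits before the network fetch below
		}
		if _, err := g.Get(ctx, r.addr); err != nil { // only intermediate chunks hit the network
			return err
		}
		// the real joiner parses the fetched data into child references and recurses
	}
	return nil
}
```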
When a reference is in this state, /stewardship reports that the reference is retrievable, but a curl request for the file via /bytes fails with an error such as curl: (18) transfer closed with 4110 bytes remaining to read (the exact byte count varies by file). If the same /bytes call is made through bee-js, the returned content is zero length and no error is thrown (that looks like a separate bee-js problem).
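For anyone who wants to check a reference against their own node, a small probe along these lines shows the mismatch. It assumes a bee API at localhost:1633 exposing GET /stewardship/{reference} (returning JSON with an isRetrievable field) and GET /bytes/{reference}; adjust host, port, and paths to your setup:

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
)

func main() {
	ref := os.Args[1] // swarm reference to check
	api := "http://localhost:1633"

	// Ask stewardship whether the reference is retrievable.
	res, err := http.Get(api + "/stewardship/" + ref)
	if err != nil {
		panic(err)
	}
	var s struct {
		IsRetrievable bool `json:"isRetrievable"`
	}
	if err := json.NewDecoder(res.Body).Decode(&s); err != nil {
		panic(err)
	}
	res.Body.Close()
	fmt.Println("stewardship says retrievable:", s.IsRetrievable)

	// Try to actually download the content; on an affected reference this
	// fails or truncates even though stewardship reported it retrievable.
	res, err = http.Get(api + "/bytes/" + ref)
	if err != nil {
		panic(err)
	}
	defer res.Body.Close()
	n, err := io.Copy(io.Discard, res.Body)
	fmt.Printf("/bytes status=%s, read %d bytes, err=%v\n", res.Status, n, err)
}
```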
Steps to reproduce
I would love to provide an example, but given the transient nature of the swarm at this point, it's pretty hard to have a reference that stays in this state. The root chunk must be retrievable, but one of the leaf chunks must not be.
Expected behavior
/stewardship should indicate non-retrievability if /bytes cannot retrieve the content.
Actual behavior
See above description.
Note: I commented out the continue at the line indicated above, and the chunks that the joiner used by /bytes retrieves are then also actually retrieved by /stewardship.
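In terms of the sketch above (same hypothetical types), that experiment roughly corresponds to dropping the short circuit so every reference, leaf or intermediate, goes through the net-only getter:

```go
// Same loop as before, but without the continue: leaf references are also
// fetched, so a missing leaf chunk now makes the stewardship check fail.
func processChunkAddressesFetchingLeaves(ctx context.Context, g Getter, fn func(Address) error, refs []ref) error {
	for _, r := range refs {
		if err := fn(r.addr); err != nil {
			return err
		}
		if _, err := g.Get(ctx, r.addr); err != nil { // leaves included
			return err
		}
		// the real joiner would still only recurse into intermediate chunks
		// (those with subtrieSize > chunkSize)
	}
	return nil
}
```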
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity.
This issue is most likely related to some network instability at the time. Closing until further evidence for this issue is found.
I will try to put together a concise demonstration of this issue. It will likely involve adding a log or two to show which chunks are actually fetched from the network and which are simply served from the local cache without a network fetch. I'll re-open once I have this put together.
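A sketch of that kind of instrumentation, again using the hypothetical types from the earlier sketch: wrap the getter so that every chunk actually requested from the network is logged, and any chunk reported to the callback but never logged was not fetched at all.

```go
import "log"

// loggingGetter wraps a Getter and logs every address that is actually
// requested through it; combined with the callback, this shows which chunks
// were really fetched and which were only reported.
type loggingGetter struct {
	inner Getter
	log   *log.Logger
}

func (g loggingGetter) Get(ctx context.Context, addr Address) ([]byte, error) {
	g.log.Printf("fetching chunk %s from the network", addr)
	return g.inner.Get(ctx, addr)
}
```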