Tagging does not work on direct BZZ upload

Open nugaon opened this issue 7 months ago • 5 comments

Context

Bee 2.5

Summary

When attempting to upload data to the /bzz endpoint using a pre-generated tag (via the Tag API), the tagging functionality does not work if the swarm-deferred-upload: false header is set (for direct uploads).

Expected behavior

The tag should track the upload progress and reflect the correct chunk counts as the upload proceeds and completes.

Actual behavior

The tag does not update the upload status when swarm-deferred-upload: false is set.

Steps to reproduce

  1. Generate a tag

curl -X POST http://localhost:1633/tags

Note the uid from the response.

  2. Attempt to upload a file with direct upload and tagging

curl -X POST \
  -H "swarm-deferred-upload: false" \
  -H "Content-Type: application/x-tar" \
  -H "swarm-postage-batch-id: <BATCH_ID>" \
  -H "swarm-tag: <TAG_ID>" \
  --data-binary @my_data.tar \
  http://localhost:1633/bzz

  3. Check the tag status

curl http://localhost:1633/tags/<TAG_ID>
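
For anyone who wants to script the reproduction, here is a minimal Python sketch of the same three steps using only the standard library. The endpoints, headers, and the uid field are taken from the curl commands above; the helper names (build_upload_headers, reproduce) are made up for illustration, and it assumes every response body is JSON.

```python
import json
import urllib.request

BEE_API = "http://localhost:1633"  # same address as in the curl commands


def build_upload_headers(batch_id: str, tag_id, deferred: bool) -> dict:
    """Headers from step 2; deferred=False selects the direct upload."""
    return {
        "swarm-deferred-upload": "true" if deferred else "false",
        "Content-Type": "application/x-tar",
        "swarm-postage-batch-id": batch_id,
        "swarm-tag": str(tag_id),
    }


def _request(method: str, url: str, data=None, headers=None) -> dict:
    # ASSUMPTION: the node answers with a JSON body on every endpoint used here.
    req = urllib.request.Request(url, data=data, headers=headers or {}, method=method)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def reproduce(tar_path: str, batch_id: str, deferred: bool = False) -> dict:
    """Run steps 1-3 and return the raw responses for inspection."""
    tag = _request("POST", f"{BEE_API}/tags")           # step 1: generate a tag
    tag_id = tag["uid"]
    with open(tar_path, "rb") as f:                      # reads the whole file into memory
        body = f.read()
    upload = _request(                                   # step 2: upload with the tag
        "POST", f"{BEE_API}/bzz", data=body,
        headers=build_upload_headers(batch_id, tag_id, deferred),
    )
    status = _request("GET", f"{BEE_API}/tags/{tag_id}")  # step 3: check the tag
    return {"tag": tag, "upload": upload, "status": status}
```

Calling reproduce("my_data.tar", "<BATCH_ID>") once with deferred=False and once with deferred=True should make it easy to compare the two tag responses side by side.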

nugaon avatar May 12 '25 07:05 nugaon

Isn't tracking with tags for deferred uploads only? For deferred uploads, the reference hash is returned immediately; the request does not wait for the upload to complete. That's why tags are needed, so we can then confirm whether the upload was successful.

But for direct uploads, the reference hash is not returned until the data has been completely uploaded and synced successfully, so there is no need to track it with a tag.

At least that is how I understood it is supposed to work?
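
To illustrate the deferred case, a rough Python sketch of polling a tag after the reference has already come back. The fetch function is passed in by the caller, so nothing here depends on a particular HTTP client. The split and synced field names and the "synced >= split" completion rule are assumptions on my side, not something confirmed in this thread; check the tag response of your own node.

```python
import time
from typing import Callable


def wait_for_tag(fetch_tag: Callable[[], dict],
                 is_done: Callable[[dict], bool],
                 interval: float = 1.0,
                 timeout: float = 60.0,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll fetch_tag until is_done accepts the result or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        tag = fetch_tag()
        if is_done(tag):
            return tag
        if time.monotonic() >= deadline:
            raise TimeoutError("tag did not reach the expected state in time")
        sleep(interval)


def fully_synced(tag: dict) -> bool:
    # ASSUMPTION: the tag JSON carries numeric `split` and `synced` counters.
    split = tag.get("split", 0)
    return split > 0 and tag.get("synced", 0) >= split
```

For a direct upload this loop would not be needed at all, because the request itself only returns once the upload has finished.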

NoahMaizels avatar May 14 '25 15:05 NoahMaizels

Yes, this is exactly the workflow. @nugaon, why do we need tags for swarm-deferred-upload: false? With swarm-deferred-upload: false, the request does not return until the upload finishes, so you can see the progress on the way.

martinconic avatar May 30 '25 08:05 martinconic

I see, I'm always confused by these direct/deferred upload definitions, so it works as intended... (and thanks @NoahMaizels for the clarification [again]). From a DevUX perspective, I would propose returning a Bad Request error when a tag ID and direct upload are set together.
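
A sketch of what that check could look like, written in Python purely for illustration (this is not Bee's actual handler code, and the function name and error message are hypothetical). It assumes that only the literal header value "false" selects a direct upload.

```python
from typing import Mapping, Optional, Tuple


def check_tag_with_direct_upload(headers: Mapping[str, str]) -> Optional[Tuple[int, str]]:
    """Return (400, message) if a tag is combined with a direct upload, else None."""
    # HTTP header names are case-insensitive, so normalise them before lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    has_tag = bool(normalized.get("swarm-tag"))
    # ASSUMPTION: only the value "false" (in any letter case) means direct upload.
    is_direct = normalized.get("swarm-deferred-upload", "").lower() == "false"
    if has_tag and is_direct:
        return 400, "swarm-tag cannot be combined with swarm-deferred-upload: false"
    return None
```

Rejecting the combination up front would tell the client immediately that the tag will not be updated, instead of leaving it to poll a tag that never changes.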

I thought this upload mode might also include the quite long period during which the tag is not updated in deferred uploads. You can try it out by uploading a bigger file and watching the console logs after building https://github.com/ethersphere/multichain/pull/17. The chunk pushing takes a relatively short time compared to the preceding task (chunking and storing?). Also, I haven't seen the stored value change in any scenario, so presumably that part does not work properly.

If so, I'll create a new issue for the aforementioned problem.

nugaon avatar Jun 02 '25 08:06 nugaon

I would like to work on this! Could you please assign it to me?

rose2221 avatar Jun 23 '25 16:06 rose2221

@rose2221 feel free to pick up this issue and send a PR when you are ready.

acha-bill avatar Jul 01 '25 15:07 acha-bill