uploads should be encrypted by default

Open istae opened this issue 1 year ago • 11 comments

The main motivation for this at the moment is to improve batch utilization. We need numbers on how this improves the situation.

Things to measure:

  1. Measure batch utilization without encryption
  2. Measure batch utilization when uploading the same content with encryption on

istae avatar May 24 '23 10:05 istae
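A minimal sketch of how the two measurements above could be compared. It assumes a batch is split into `2^bucket_depth` buckets, that each bucket holds `2^(depth - bucket_depth)` chunks, and that the batch counts as full once its fullest bucket is full. The bucket depth of 16 and all function names are assumptions for illustration, not taken from this thread.

```python
def bucket_capacity(depth: int, bucket_depth: int = 16) -> int:
    """Chunks one bucket can hold, assuming 2^(depth - bucket_depth) slots."""
    return 2 ** (depth - bucket_depth)


def utilization_ratio(max_bucket_fill: int, depth: int, bucket_depth: int = 16) -> float:
    """Fill level of the fullest bucket as a fraction of its capacity."""
    return max_bucket_fill / bucket_capacity(depth, bucket_depth)


def compare_runs(plain_fill: int, encrypted_fill: int, depth: int, bucket_depth: int = 16) -> dict:
    """Compare the fullest-bucket fill of an unencrypted and an encrypted upload run."""
    plain = utilization_ratio(plain_fill, depth, bucket_depth)
    encrypted = utilization_ratio(encrypted_fill, depth, bucket_depth)
    return {"plain": plain, "encrypted": encrypted, "delta": encrypted - plain}
```

A negative `delta` would mean the encrypted run filled its fullest bucket less than the plain run for the same content, i.e. the batch has more headroom left.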

What, really? This would be a breaking change for existing upload tools?

ldeffenb avatar May 24 '23 11:05 ldeffenb

Or are you talking about only the gateway and not the API?

ldeffenb avatar May 24 '23 11:05 ldeffenb

@ldeffenb what we are thinking is that the encryption header should be on by default, so that the node encrypts the uploaded data. Any download endpoint can then be used as normal with the returned reference, which also contains the encryption key.

In what way do you think this will break the existing tools?

istae avatar May 24 '23 13:05 istae

My OSM uploader uploads the individual files and captures their references. It then externally builds the mantaray manifest referencing those file references. If the /bytes API defaults to encryption, then all of my following OSM updates (or at least the first update after the change) will end up getting unique (or new) references for the unchanged files that already exist within the swarm. Given that this dataset is millions and millions of files, the cost to the stamp will be substantial as compared to the current re-use of non-changed map tiles from one update to the next.

And I can't believe I'm the only one relying on the swarm reference to not change if the contents do not change when continuing to use the same API for uploading.

ldeffenb avatar May 24 '23 18:05 ldeffenb
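The concern above can be illustrated with a toy model. This is not Swarm's actual hashing or encryption: SHA-256 stands in for the chunk hash and a SHAKE-256 keystream XOR stands in for the cipher, purely to show why a fresh random key per upload yields a new reference for unchanged content. Both function names are hypothetical.

```python
import hashlib
import os


def plain_reference(content: bytes) -> str:
    """Content-addressed reference: identical content gives an identical reference."""
    return hashlib.sha256(content).hexdigest()


def encrypted_reference(content: bytes) -> str:
    """Toy encrypted reference: hash of the ciphertext followed by the random key."""
    key = os.urandom(32)  # fresh random key for every upload (assumed behaviour)
    keystream = hashlib.shake_256(key).digest(len(content))
    ciphertext = bytes(a ^ b for a, b in zip(content, keystream))
    return hashlib.sha256(ciphertext).hexdigest() + key.hex()
```

In this model, re-uploading an unchanged map tile without encryption reuses the existing reference, while every encrypted upload of the same tile produces a different one and would therefore need to be stamped again.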

Of course, it only requires the uploading tools to set the encryption header to false, but changing defaults like this needs to be well publicized.

ldeffenb avatar May 24 '23 18:05 ldeffenb
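As a sketch of what explicitly opting out could look like for an upload tool: the header names `Swarm-Encrypt` and `Swarm-Postage-Batch-Id`, the default API address, and the `reference` field of the response are assumptions here and should be checked against the Bee API reference. The helper names are hypothetical.

```python
import json
import urllib.request


def upload_headers(batch_id: str, encrypt: bool = False) -> dict:
    """Build upload headers that always state the encryption choice explicitly."""
    return {
        "Swarm-Postage-Batch-Id": batch_id,
        "Swarm-Encrypt": "true" if encrypt else "false",
        "Content-Type": "application/octet-stream",
    }


def upload_bytes(data: bytes, batch_id: str, encrypt: bool = False,
                 api: str = "http://localhost:1633") -> str:
    """POST raw data to the /bytes endpoint and return the reference from the response."""
    request = urllib.request.Request(
        f"{api}/bytes",
        data=data,
        headers=upload_headers(batch_id, encrypt),
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["reference"]
```

Sending the header on every request, rather than relying on the node's default, keeps a tool's references stable regardless of which way the default is set.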

I think default encryption is more in line with Swarm's mission, and yes, the only change for uploading tools is to set the header to false if they are incompatible with the encrypted references. Yes, breaking changes do need to be well publicized.

istae avatar May 24 '23 20:05 istae

We need to make sure of the following:

  • Before getting started with this one, we need to get some benchmarks regarding the batch utilisation. We need to know how it was before the implementation of the current one, and how it changed afterwards.
  • We communicate this to the community at least 2 weeks before release.
  • Sync with Aron - he may have some encryption defaults on the bee-js side.

nikipapadatou avatar Sep 11 '23 14:09 nikipapadatou

Initial testing has revealed that encryption does not improve utilization: https://hackmd.io/qZF11cycQzOFb0_YyUYUpw

istae avatar Sep 14 '23 12:09 istae

It would be interesting to see a histogram of the /stamps/{batchID}/buckets values for these 4 tests. https://docs.ethswarm.org/debug-api/#tag/Postage-Stamps/paths/~1stamps~1%7Bbatch_id%7D~1buckets/get

ldeffenb avatar Sep 14 '23 13:09 ldeffenb
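A minimal sketch of the histogram suggested above. The response shape is an assumption (a `buckets` list whose entries carry `bucketID` and `collisions`), and the sample payload is made up for illustration, not taken from the linked tests.

```python
from collections import Counter


def collisions_histogram(buckets_response: dict) -> dict:
    """Map each collision count to the number of buckets that have that count."""
    counts = Counter(bucket["collisions"] for bucket in buckets_response["buckets"])
    return dict(sorted(counts.items()))


# Made-up payload in the assumed shape of GET /stamps/{batch_id}/buckets
sample = {
    "depth": 17,
    "bucketDepth": 16,
    "bucketUpperBound": 2,
    "buckets": [
        {"bucketID": 0, "collisions": 0},
        {"bucketID": 1, "collisions": 2},
        {"bucketID": 2, "collisions": 0},
        {"bucketID": 3, "collisions": 1},
    ],
}

histogram = collisions_histogram(sample)
```

Comparing the histograms of the encrypted and unencrypted runs would show whether encryption spreads chunks more evenly across buckets or only shifts which buckets fill first.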

This one is not expected to have any impact on batch utilisation, but for privacy reasons we need to proceed with this. In order to do so, we need the following:

  • Sync with the bee-js team so they make the relevant change on their side as well (encryption by default set to true)
  • Let @NoahMaizels know at least 10 days before we release this one, so that he can communicate it to the community (changes the size).

nikipapadatou avatar Sep 25 '23 10:09 nikipapadatou

  • (changes the size).

I presume you mean the size of the produced references since they'll now include the decryption key? That'll take a bit to get used to (even though I'll be explicitly flagging my uploads as non-encrypted).

ldeffenb avatar Sep 25 '23 14:09 ldeffenb
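A small sketch of how a tool could tell the two reference sizes apart, assuming a plain reference is a 32-byte address (64 hex characters) and an encrypted reference is that address followed by a 32-byte decryption key (128 hex characters). The function name is hypothetical.

```python
def reference_kind(reference: str) -> str:
    """Classify a hex-encoded reference by its decoded length."""
    raw = bytes.fromhex(reference)  # raises ValueError on non-hex input
    if len(raw) == 32:
        return "plain"
    if len(raw) == 64:
        return "encrypted"
    raise ValueError(f"unexpected reference length: {len(raw)} bytes")
```

Tools that store references in fixed-width fields would need room for the longer form once encrypted uploads become common.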