Compact postage batch usage with chunk optimization
Summary
I propose a way of encoding chunks that reduces the minimum postage batch depth required to contain a given amount of data of any kind. In this way postage batches can provide real usable space close to their theoretical maximum volume, maximizing their usage.
Motivation
One of the main problems with Swarm today is upload cost. The minimum required postage batch depth is determined by how data chunks are distributed into postage stamp buckets.
At the moment the Swarm documentation offers a table of suggested minimum required depths, statistically estimated to succeed in 99.9% of cases on random data (0.1% failure). This approach is heavily under-optimized, and doesn’t allow any depth lower than 22. https://docs.ethswarm.org/docs/concepts/incentives/postage-stamps/#effective-utilisation-table
With the new proposed approach, not only do I calculate with precision the real minimum required postage batch depth, with a 100% success rate (0% failure), but I also make lower depths usable, down to the minimum of 17, and I fit into the batch the maximum possible amount of chunked data, tending to the postage batch’s theoretical maximum volume limit.
Given an amount of data of any kind, it is thus possible to calculate the lowest required depth with precision, reducing costs and helping to incentivize usage by end users with any kind of application.
Implementation
My expertise with Golang development is very limited, but I’ve already implemented this solution in C# with the Bee.Net library (https://github.com/Etherna/bee-net).
I’ve defined an additional parameter called “compactLevel” that, in the range [0, 65535], defines the level of effort, or CPU power, to invest in chunk optimization. The default level 0 doesn’t apply any compaction, and its behavior is identical to the current implementation. A level n instead tries to encode the chunk up to n times with different keys, looking for the hash that best fits into the buckets. The max value 65535 has been chosen to match the number of postage stamp buckets.
The best bucket to look for with this method is the first one with the lowest number of already-present hash collisions. It follows that the order of chunk insertion is relevant.
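The bucket selection above can be sketched as follows. This is an illustrative Python sketch, not the Bee.Net code: it assumes the standard Swarm layout where a postage batch has 2^16 buckets addressed by the first two bytes of the chunk address, and the helper names are hypothetical.

```python
# Sketch of how a chunk hash maps to a postage stamp bucket, and how the
# "best" bucket is chosen among candidate hashes. Helper names are
# illustrative, not from Bee.Net.

NUM_BUCKETS = 1 << 16  # 65536 postage stamp buckets

def bucket_index(chunk_hash: bytes) -> int:
    """Bucket = integer value of the first 16 bits of the chunk address."""
    return int.from_bytes(chunk_hash[:2], "big")

def pick_best(candidate_hashes, bucket_collisions):
    """Return the first candidate whose bucket currently has the fewest
    collisions; insertion order matters, so ties keep the earliest one."""
    best, best_collisions = None, None
    for h in candidate_hashes:
        c = bucket_collisions[bucket_index(h)]
        if best is None or c < best_collisions:
            best, best_collisions = h, c
    return best

# Example: two candidate hashes landing in buckets with different loads.
collisions = [0] * NUM_BUCKETS
collisions[0x0102] = 3
collisions[0xFFEE] = 1
candidates = [bytes([0x01, 0x02]) + bytes(30),
              bytes([0xFF, 0xEE]) + bytes(30)]
print(bucket_index(pick_best(candidates, collisions)))  # → 65518 (0xFFEE)
```

Because the choice depends on the current collision counts, inserting chunks in a different order can change which candidate wins, which is why insertion order matters.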
Another requirement of this algorithm is deterministic execution, in the sense that hashing must be consistent between runs on the same data, with the same compaction level and the same starting status of the postage buckets.
Because of the different nature of mantaray manifest chunks and data chunks, the two kinds are encoded in different ways:
- Mantaray manifest chunks are optimized by obfuscating each one with the best key. The best key is the 32B key that, when used to obfuscate the chunk, produces the hash that fits into the best bucket for that chunk. This use of obfuscation keys keeps compatibility with the current implementation.
- Data chunks are optimized by encrypting each one with the best 32B key, similarly to the above. The difference here is that the encryption key is stored in the parent intermediate chunk, or in the metadata of the manifest itself with the file entry. Intermediate chunks, like leaf data chunks, are encrypted with the same method recursively. Keys are stored in the 4kB of intermediate chunks, as a sequence of at most 64 `(refHash, encKey)` pairs about child chunks. Instructions for interpreting intermediate chunks are stored inside the manifest metadata. Unfortunately this breaks compatibility with the current implementation, because it introduces the concept of recursive chunk encryption, but the use of metadata keeps backwards compatibility with existing chunks without recursive encryption.
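The intermediate-chunk layout described above can be sketched as follows. This is a simplified illustration of the idea, not Bee.Net's exact serialization: it only shows that 64 pairs of 32-byte hash plus 32-byte key exactly fill the 4kB chunk payload.

```python
# Sketch of the intermediate-chunk layout: up to 64 (refHash, encKey)
# pairs packed into the 4 kB chunk payload.

CHUNK_SIZE = 4096
HASH_SIZE = 32
KEY_SIZE = 32
MAX_PAIRS = CHUNK_SIZE // (HASH_SIZE + KEY_SIZE)  # = 64 child references

def pack_intermediate(pairs):
    """Serialize (refHash, encKey) pairs into an intermediate chunk payload."""
    assert len(pairs) <= MAX_PAIRS
    payload = b"".join(ref + key for ref, key in pairs)
    return payload

def unpack_intermediate(payload):
    stride = HASH_SIZE + KEY_SIZE
    return [(payload[i:i + HASH_SIZE], payload[i + HASH_SIZE:i + stride])
            for i in range(0, len(payload), stride)]

pairs = [(bytes([i]) * HASH_SIZE, bytes([i + 1]) * KEY_SIZE) for i in range(64)]
payload = pack_intermediate(pairs)
print(len(payload))                           # → 4096
print(unpack_intermediate(payload) == pairs)  # → True
```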
Code to optimize data chunks can be found in the hashing pipeline at the BMT stage: https://github.com/Etherna/bee-net/blob/44ce8801675c075034fcb1d9f9decb1307574798/src/BeeNet.Core/Hashing/Pipeline/ChunkBmtPipelineStage.cs#L48
Code to optimize manifest chunks can be found in the manifest hashing implementation: https://github.com/Etherna/bee-net/blob/44ce8801675c075034fcb1d9f9decb1307574798/src/BeeNet.Core/Manifest/MantarayNode.cs#L153
Looking at the code, let’s start from data chunk encoding.
At the beginning, the plain chunk hash is calculated. If compaction isn’t enabled, this result is used directly as the chunk hash. Otherwise, if compactLevel > 0, the plain hash is used as a seed to calculate the encryption key with the method GetBestChunkAsync.
The method GetBestChunkAsync performs an optimistic search for the key. The optimistic approach has been adopted because, as previously said, the chunks’ insertion order is relevant, but hash calculation is also performed in parallel on different chunks. Because of this, later chunks can be processed before earlier ones, and their hashes can collide on the same buckets, invalidating previously found best buckets. This can be easily identified with a check, and the best bucket can be recalculated if required when inserting a chunk. Because in the general case the probability of this kind of collision is very low, an optimistic approach is optimal to keep good performance with parallelization.
The method GetBestChunkAsync uses a cache of results, to avoid recalculating previous results in case of an optimistic miss. It then searches for the best key, and stops to wait on a semaphore that is released by the previous chunk once it has been inserted. Released by the semaphore, only a single chunk can proceed at a time, and necessarily in order. The collisions on the selected bucket are verified to be unchanged from the optimistic case, and the best chunk is recalculated if necessary.
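The ordered-commit scheme above can be sketched in a few lines. This is a minimal illustration, not the Bee.Net implementation: workers search in parallel, but commits to the shared buckets happen strictly in chunk order, and each commit re-checks its optimistic choice once it holds its turn. All names here are hypothetical.

```python
# Sketch of optimistic parallel search with ordered commit: each chunk
# waits for its predecessor before touching the shared bucket counters,
# then re-verifies (and possibly re-picks) its best bucket.
import threading

class OrderedCommitter:
    def __init__(self, num_chunks, buckets):
        self.buckets = buckets
        # Event i is set when chunk i-1 has been inserted.
        self.turn = [threading.Event() for _ in range(num_chunks)]
        self.turn[0].set()
        self.results = [None] * num_chunks

    def commit(self, index, candidate_buckets):
        self.turn[index].wait()  # wait until all previous chunks are inserted
        # Re-verify the optimistic choice now that we hold our turn:
        best = min(candidate_buckets, key=lambda b: self.buckets[b])
        self.buckets[best] += 1  # insert the chunk's stamp
        self.results[index] = best
        if index + 1 < len(self.turn):
            self.turn[index + 1].set()

# Even with threads started out of order, commits happen in chunk order,
# so the result is deterministic.
buckets = [0, 0, 0]
oc = OrderedCommitter(3, buckets)
threads = [threading.Thread(target=oc.commit, args=(i, [0, 1, 2]))
           for i in reversed(range(3))]
for t in threads: t.start()
for t in threads: t.join()
print(oc.results)  # → [0, 1, 2]
```

The key property is that the bucket state each chunk sees at commit time depends only on the chunk order, not on thread scheduling, which preserves the determinism requirement.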
The best chunk is searched for by iterating a uint16 from 0 to compactLevel; at each iteration the cache is checked for a previous result. In case of a cache miss, a new key is generated from the plain hash by replacing its last two bytes with the iterator, then hashing again. This leads to unique but also deterministic keys. The chunk’s data is encrypted with the key, then the chunk is hashed. Results, with their expected collisions, are cached.
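The key derivation just described can be sketched as follows. Note an assumption: Bee.Net uses the Swarm/BMT hash (Keccak-256); the standard-library SHA3-256 is used here only as a stand-in, and the big-endian placement of the iterator bytes is illustrative.

```python
# Sketch of deterministic key derivation: replace the last two bytes of
# the plain chunk hash with the uint16 iterator, then hash again.
# SHA3-256 stands in for the Keccak-256 used by Swarm.
import hashlib

def derive_key(plain_hash: bytes, i: int) -> bytes:
    assert len(plain_hash) == 32 and 0 <= i <= 0xFFFF
    seed = plain_hash[:30] + i.to_bytes(2, "big")
    return hashlib.sha3_256(seed).digest()

plain = hashlib.sha3_256(b"chunk data").digest()
k0 = derive_key(plain, 0)
k1 = derive_key(plain, 1)
print(k0 != k1)                    # → True: unique per iteration
print(k0 == derive_key(plain, 0))  # → True: deterministic across runs
```

Deriving keys from the plain hash keeps them unique per chunk while remaining reproducible between runs, which is what the determinism requirement needs.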
Verifying the results, the first one with the lowest number of collisions on its bucket is the best. The bucket collection has been built in a way that efficiently keeps track of the minimum collision value across all buckets, and this permits stopping immediately with a result when the current collisions already equal the lowest possible number. This can reduce the number of tested cases.
When the best key, and so the best fitting chunk, has been found, the key is passed together with the hash to the next stage of the pipeline, and intermediate chunks are built as previously described. The result returned from Sum() to the manifest builder contains different information, identified as SwarmChunkReference:
- `XorEncryptKey? EncryptionKey`: an optional chunk encryption key
- `SwarmHash Hash`: the chunk’s hash
- `bool UseRecursiveEncryption`: whether or not to use recursive encryption
The manifest doesn’t need to report details about compaction levels in its metadata; only decoding information is needed. So, if an EncryptionKey is found, it’s saved in the file’s entry with the metadata label "chunkEncryptKey", and if intermediate chunks need to be interpreted with recursive encryption, the value true is saved in metadata with the "recursiveEncrypt" label. If not found, the default is false. The ChunkJoiner and ChunkTraverser tools have been updated accordingly to interpret this information.
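The metadata labels above can be illustrated with a minimal sketch. The surrounding entry structure and field values here are hypothetical; only the "chunkEncryptKey" and "recursiveEncrypt" labels come from the proposal.

```python
# Illustrative shape of a file-entry metadata map carrying the decoding
# information; absent labels fall back to the backwards-compatible default.
entry_metadata = {
    "Filename": "video.mp4",       # ordinary entry metadata (hypothetical)
    "chunkEncryptKey": "ab" * 32,  # hex of the 32B encryption key
    "recursiveEncrypt": "true",    # intermediate chunks are also encrypted
}

def needs_recursive_decrypt(metadata) -> bool:
    # An absent label defaults to false, keeping old chunks readable.
    return metadata.get("recursiveEncrypt", "false") == "true"

print(needs_recursive_decrypt(entry_metadata))  # → True
print(needs_recursive_decrypt({}))              # → False
```

Defaulting to false when the label is absent is what makes existing, non-recursively-encrypted content continue to work unchanged.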
Mantaray manifest chunks are handled similarly, but more simply, and keep compatibility with existing code. They are processed sequentially, so the algorithm doesn’t need an optimistic approach. The goal is to find the best obfuscation key. Obfuscation keys are generated starting from the hash of the chunk without obfuscation, replacing the last 2 bytes with the uint16 iterator, and hashing again. The remaining code is similar to the previous one.
Results
I’ve implemented a sample application that consumes Bee.Net to provide hashing statistics: https://github.com/Etherna/bee-net-stats
The test simulates the upload of a file of random data, different each time. Each test is performed 10 times, and average results are shown here. The data size is increased in 1MB steps, from 500MB (524288000 bytes) to 504MB (528482304 bytes). Different compact levels are tested. Generated chunks include data chunks, intermediate chunks and the manifest chunks describing the file entry.
| Source file size | Total chunks | Compact level | Avg depth | Avg seconds |
|---|---|---|---|---|
| 500MB | 129012 | 0 | 20 | 4.8616977 |
| 500MB | 130036 | 1 | 20 | 9.8073598 |
| 500MB | 130036 | 2 | 18.8 | 12.9936236 |
| 500MB | 130036 | 5 | 18 | 21.4324366 |
| 500MB | 130036 | 10 | 18 | 34.8184973 |
| 500MB | 130036 | 20 | 18 | 41.680212 |
| 500MB | 130036 | 50 | 18 | 46.7259444 |
| 500MB | 130036 | 100 | 18 | 47.9376216 |
| 500MB | 130036 | 200 | 18 | 49.2228616 |
| 500MB | 130036 | 500 | 17.1 | 52.2383984 |
| 500MB | 130036 | 1000 | 17 | 49.2834027 |
| 500MB | 130036 | 2000 | 17 | 51.2397904 |
| 500MB | 130036 | 5000 | 17 | 55.3805355 |
| 500MB | 130036 | 10000 | 17 | 61.6001811 |
| 500MB | 130036 | 20000 | 17 | 73.8819322 |
| 500MB | 130036 | 50000 | 17 | 73.3438695 |
| 500MB | 130036 | 65535 | 17 | 75.7571539 |
| 501MB | 129270 | 0 | 20 | 4.7144738 |
| 501MB | 130296 | 1 | 20 | 9.7485388 |
| 501MB | 130296 | 2 | 18.8 | 13.0561349 |
| 501MB | 130296 | 5 | 18 | 21.4855436 |
| 501MB | 130296 | 10 | 18 | 34.4280584 |
| 501MB | 130296 | 20 | 18 | 41.2054744 |
| 501MB | 130296 | 50 | 18 | 45.8917271 |
| 501MB | 130296 | 100 | 18 | 48.7465933 |
| 501MB | 130296 | 200 | 18 | 53.0056445 |
| 501MB | 130296 | 500 | 17.3 | 52.5092546 |
| 501MB | 130296 | 1000 | 17 | 51.4568799 |
| 501MB | 130296 | 2000 | 17 | 51.9211933 |
| 501MB | 130296 | 5000 | 17 | 55.8439834 |
| 501MB | 130296 | 10000 | 17 | 61.6595215 |
| 501MB | 130296 | 20000 | 17 | 69.6877311 |
| 501MB | 130296 | 50000 | 17 | 76.890554 |
| 501MB | 130296 | 65535 | 17 | 74.891462 |
| 502MB | 129528 | 0 | 20 | 4.715505 |
| 502MB | 130556 | 1 | 20 | 9.8545345 |
| 502MB | 130556 | 2 | 18.6 | 12.9786215 |
| 502MB | 130556 | 5 | 18 | 21.4480894 |
| 502MB | 130556 | 10 | 18 | 34.837008 |
| 502MB | 130556 | 20 | 18 | 42.6157297 |
| 502MB | 130556 | 50 | 18 | 46.8623895 |
| 502MB | 130556 | 100 | 18 | 49.4535076 |
| 502MB | 130556 | 200 | 18 | 50.7652596 |
| 502MB | 130556 | 500 | 18 | 52.332985 |
| 502MB | 130556 | 1000 | 17 | 54.1236994 |
| 502MB | 130556 | 2000 | 17 | 53.686172 |
| 502MB | 130556 | 5000 | 17 | 55.9355314 |
| 502MB | 130556 | 10000 | 17 | 64.2138653 |
| 502MB | 130556 | 20000 | 17 | 71.8987032 |
| 502MB | 130556 | 50000 | 17 | 75.4131588 |
| 502MB | 130556 | 65535 | 17 | 76.8356262 |
| 503MB | 129786 | 0 | 20 | 4.7274726 |
| 503MB | 130816 | 1 | 20 | 9.6885144 |
| 503MB | 130816 | 2 | 18.7 | 12.9922294 |
| 503MB | 130816 | 5 | 18 | 21.5604944 |
| 503MB | 130816 | 10 | 18 | 34.7752492 |
| 503MB | 130816 | 20 | 18 | 42.2068468 |
| 503MB | 130816 | 50 | 18 | 46.1989665 |
| 503MB | 130816 | 100 | 18 | 50.537823 |
| 503MB | 130816 | 200 | 18 | 50.7851863 |
| 503MB | 130816 | 500 | 18 | 55.5057441 |
| 503MB | 130816 | 1000 | 17.8 | 53.0813187 |
| 503MB | 130816 | 2000 | 17.1 | 56.1055961 |
| 503MB | 130816 | 5000 | 17 | 57.0325347 |
| 503MB | 130816 | 10000 | 17 | 66.1454971 |
| 503MB | 130816 | 20000 | 17 | 69.6537728 |
| 503MB | 130816 | 50000 | 17 | 71.6421064 |
| 503MB | 130816 | 65535 | 17 | 93.2257601 |
| 504MB | 130044 | 0 | 20 | 4.7980694 |
| 504MB | 131076 | 1 | 20 | 9.97654 |
| 504MB | 131076 | 2 | 18.7 | 13.0223564 |
| 504MB | 131076 | 5 | 18 | 21.5428152 |
| 504MB | 131076 | 10 | 18 | 34.7821753 |
| 504MB | 131076 | 20 | 18 | 43.7320456 |
| 504MB | 131076 | 50 | 18 | 47.0791892 |
| 504MB | 131076 | 100 | 18 | 50.0852819 |
| 504MB | 131076 | 200 | 18 | 53.4406342 |
| 504MB | 131076 | 500 | 18 | 55.9797288 |
| 504MB | 131076 | 1000 | 18 | 59.7013192 |
| 504MB | 131076 | 2000 | 18 | 64.819459 |
| 504MB | 131076 | 5000 | 18 | 72.2588303 |
| 504MB | 131076 | 10000 | 18 | 89.2885439 |
| 504MB | 131076 | 20000 | 18 | 111.2160627 |
| 504MB | 131076 | 50000 | 18 | 120.6244225 |
| 504MB | 131076 | 65535 | 18 | 122.1527205 |
Looking at the results, we can see how in the 500MB case the upload would by default require a postage batch depth of 20, but already with compact level 2 the depth starts to drop, and it improves as the level increases. From compact level 500 and above it fits into depth 17 the majority of times, which means a price reduction of 2^(20-17) = 8 times compared to the best fit without compaction, or, taking the values suggested on the documentation website, a reduction of 2^(22-17) = 32 times.
We can see hashing time increasing, but because the key search stops as soon as a chunk with the minimum collision value is found, the increase is not linear with the compact level. This makes higher compact levels usable without linear CPU costs.
We can also see that when the compaction level is greater than 0, the chunk amount increases. This is due to the use of recursive encryption, which doubles the number of intermediate chunks. As a consequence, at 504MB the chunks with recursive encryption number 131076, which is 4 more than 2^17 = 131072: depth 17 misses fitting them by 4 chunks. This is, in fact, the real data limit of this depth.
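The capacity arithmetic behind this observation is worth making explicit. A batch of depth d holds at most 2^d chunks of 4096 bytes each (when every bucket is filled to its limit), which is exactly the ceiling the 504MB case runs into:

```python
# Theoretical capacity of a postage batch of a given depth: 2^d chunks
# of 4096 bytes each, assuming perfectly balanced buckets.
CHUNK_SIZE = 4096

def max_chunks(depth: int) -> int:
    return 1 << depth

print(max_chunks(17))               # → 131072 chunks at depth 17
print(131076 - max_chunks(17))      # → 4 chunks over the depth-17 limit
print(max_chunks(17) * CHUNK_SIZE)  # → 536870912 bytes (512 MiB) theoretical
```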
Drawbacks
- Any data uploaded with `compactionLevel > 0` can’t be read by clients without support for recursive encryption on data chunks
- Uploads performed with chunk optimization need exclusive access to the buckets of a postage batch to provide deterministic results. Because the optimization considers the bucket status at each chunk insertion, concurrent uploads using the same buckets can alter the final hash, invalidating deterministic execution
- Due to the requirement of hash predictability, SOCs cannot be optimized