buzhash64: deterministically create a balanced bh table

Open ThomasWaldmann opened this issue 6 months ago • 2 comments

The previous approach used cryptographically strong randomness, but it did not guarantee an exact 50:50 distribution of 0 and 1 bits at each bit position in the table.

With this change, that balance always holds because of the way the table is constructed.
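To illustrate what "balanced by construction" can mean, here is a minimal sketch. It is not the code from this PR. It assumes a 256-entry table of 64-bit values, and the function name `balanced_table`, the `seed` parameter, and the use of SHAKE-256 as the deterministic byte source are all assumptions made for this example. For each bit position it picks exactly half of the table entries to receive a 1, so every position ends up with 128 ones and 128 zeros.

```python
import hashlib

TABLE_SIZE = 256  # assumption: one table entry per possible byte value
WORD_BITS = 64    # assumption: 64-bit entries, as the name buzhash64 suggests
KEY_LEN = 8       # bytes of stream used per sort key


def balanced_table(seed: bytes) -> list[int]:
    """Return TABLE_SIZE ints with exactly TABLE_SIZE // 2 one-bits per bit position."""
    # Deterministic byte stream. SHAKE-256 is only a stdlib stand-in here,
    # not the generator used by the PR.
    stream = hashlib.shake_256(seed).digest(WORD_BITS * TABLE_SIZE * KEY_LEN)
    table = [0] * TABLE_SIZE
    for bit in range(WORD_BITS):
        base = bit * TABLE_SIZE * KEY_LEN
        # Give every table index a pseudo-random sort key for this bit position.
        keys = [
            int.from_bytes(stream[base + i * KEY_LEN: base + (i + 1) * KEY_LEN], "big")
            for i in range(TABLE_SIZE)
        ]
        # Sorting by key shuffles the indices; ties are broken by index,
        # so the result stays deterministic.
        order = sorted(range(TABLE_SIZE), key=lambda i: (keys[i], i))
        # The first half of the shuffled indices gets a 1 at this position.
        for i in order[: TABLE_SIZE // 2]:
            table[i] |= 1 << bit
    return table


if __name__ == "__main__":
    table = balanced_table(b"example seed")
    ones_per_bit = [sum((entry >> bit) & 1 for entry in table) for bit in range(WORD_BITS)]
    print(set(ones_per_bit))
```

The same seed always yields the same table, and the balance holds for any seed because it comes from the construction, not from the quality of the random bytes.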

ThomasWaldmann, Jun 15 '25 09:06

Codecov Report

Attention: Patch coverage is 20.00000% with 4 lines in your changes missing coverage. Please review.

Project coverage is 82.07%. Comparing base (2924fc5) to head (3617b63). Report is 11 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/borg/archiver/benchmark_cmd.py | 0.00% | 4 Missing :warning: |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8924      +/-   ##
==========================================
+ Coverage   82.04%   82.07%   +0.02%     
==========================================
  Files          77       77              
  Lines       13491    13490       -1     
  Branches     1996     1996              
==========================================
+ Hits        11069    11072       +3     
+ Misses       1757     1753       -4     
  Partials      665      665              


codecov[bot], Jun 15 '25 10:06

Hmm, is random.Random (Mersenne Twister), seeded with a 256-bit key derived from a 256-bit secret, good enough for this, or do we need a CSPRNG?

Since the stdlib secrets module does not support seeding, we would need something like AES-CTR for this.

Update: I created our own CSPRNG based on AES256-CTR. This is not only because it is cryptographically strong, but also because the seeding of random.Random sounded like it could change between Python versions (they would "offer" a compatible seeding mode, but I guess that might then require code changes on our side). For our chunker, we need something that is long-term stable and fully deterministic.
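For readers unfamiliar with counter-mode generators, a rough sketch of the idea follows. It is not the PR's implementation: the stdlib has no AES, so this example swaps in SHA-256 over `key || counter` as the block function where the PR uses AES256-CTR. The class name `CounterModeRNG` and its methods are hypothetical. What it is meant to show is that the output depends only on the key and on explicitly fixed choices (byte order, counter width, rejection sampling), not on interpreter internals.

```python
import hashlib


class CounterModeRNG:
    """Deterministic byte generator: block i = SHA-256(key || i).

    Stand-in for an AES256-CTR keystream; hypothetical name and API.
    """

    def __init__(self, key: bytes):
        if len(key) != 32:
            raise ValueError("expected a 256-bit key")
        self._key = key
        self._counter = 0
        self._buffer = b""

    def random_bytes(self, n: int) -> bytes:
        # Produce blocks until enough bytes are buffered, then hand out n of them.
        while len(self._buffer) < n:
            counter_bytes = self._counter.to_bytes(16, "big")  # fixed width and byte order
            self._buffer += hashlib.sha256(self._key + counter_bytes).digest()
            self._counter += 1
        out, self._buffer = self._buffer[:n], self._buffer[n:]
        return out

    def randbelow(self, upper: int) -> int:
        """Return a uniform integer in [0, upper) using rejection sampling."""
        if upper <= 0:
            raise ValueError("upper must be positive")
        nbits = (upper - 1).bit_length()
        nbytes = max(1, (nbits + 7) // 8)
        mask = (1 << nbits) - 1
        while True:
            value = int.from_bytes(self.random_bytes(nbytes), "big") & mask
            if value < upper:  # reject out-of-range values to avoid modulo bias
                return value


if __name__ == "__main__":
    rng_a = CounterModeRNG(bytes(32))
    rng_b = CounterModeRNG(bytes(32))
    # Same key, same sequence.
    print([rng_a.randbelow(256) for _ in range(8)] == [rng_b.randbelow(256) for _ in range(8)])
```

Because every step is spelled out in terms of bytes and integers, the sequence for a given key should not change with the Python version, which is the property the chunker needs.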

ThomasWaldmann, Jun 15 '25 10:06