
radisk file hashing

Open sirpy opened this issue 5 years ago • 4 comments

Because of SEA, file paths that include a pubkey can be very long. radisk uses filenames, derived from those paths, as its lexical index, which doesn't work once a filename exceeds 255 bytes (the limit on Linux).

This PR solves it by:

  • save file as sha256(filename)

  • get file as sha256(filename)

  • keep an in-memory index of used filenames, i.e. index[hash] = true

  • keep an on-disk list of the original filenames ([dbname].idx)

  • whenever we get a put request, we check the in-memory index for this filename: if it is already there we proceed as usual, if not we append the filename to the on-disk index

  • on the initialization step (i.e. store.list) we read the .idx file instead of iterating over the directory content for filenames, so rad has all the index information it needs
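The steps above can be sketched roughly as follows. This is a minimal standalone illustration, not the code in this PR and not radisk's actual store interface: `createHashedStore`, `hashName`, and the synchronous `put`/`get`/`list` methods are hypothetical names, and it assumes the original filenames never contain a newline so that the .idx file can hold one name per line.

```javascript
// Sketch only: createHashedStore and hashName are hypothetical names,
// not part of radisk's API.
const crypto = require('crypto');
const fs = require('fs');
const path = require('path');

// Map an arbitrarily long logical filename to a fixed 64-character hex name.
function hashName(file) {
  return crypto.createHash('sha256').update(file).digest('hex');
}

function createHashedStore(dir, dbname) {
  fs.mkdirSync(dir, { recursive: true });
  const idxPath = path.join(dir, dbname + '.idx');
  const index = Object.create(null); // index[hash] = true

  // Initialization: read the original names from the .idx file instead of
  // listing the directory (assumes one newline-free name per line).
  const names = fs.existsSync(idxPath)
    ? fs.readFileSync(idxPath, 'utf8').split('\n').filter(Boolean)
    : [];
  for (const name of names) index[hashName(name)] = true;

  return {
    put(file, data) {
      const hash = hashName(file);
      if (!index[hash]) {
        // First time we see this name: record it on disk and in memory.
        fs.appendFileSync(idxPath, file + '\n');
        index[hash] = true;
        names.push(file);
      }
      fs.writeFileSync(path.join(dir, hash), data);
    },
    get(file) {
      const p = path.join(dir, hashName(file));
      return fs.existsSync(p) ? fs.readFileSync(p, 'utf8') : undefined;
    },
    // Original (unhashed) names, which is what a lexical index would need.
    list() {
      return names.slice();
    }
  };
}
```

In this sketch every data file is named by a 64-character hex digest, so the name on disk stays well under the 255-byte limit no matter how long the original filename is, and the original names are recoverable only through the .idx file.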

sirpy avatar Aug 31 '20 08:08 sirpy

w00h00! 👏

%C (something like that) is the "folder" file; maybe we can use/merge this with the .idx file? For instance, having the .idx file in the same folder could lead to a collision (which is why I save tmps outside).

amark avatar Sep 01 '20 07:09 amark

@amark well, if you can explain the structure of %C (I didn't see it used anywhere), maybe we can use it as the idx. But %C is saved inside the directory, so why would saving the .idx there lead to a collision? It might break if you have multiple instances running with the same "opt.file", but that would break radisk anyway, no?

sirpy avatar Sep 01 '20 08:09 sirpy

@amark this is important

sirpy avatar Nov 25 '20 06:11 sirpy

Well, I tried writing my own fix for the fun of it, but Gun internals are still over my head. I humbly bump this PR, as a solution to https://github.com/amark/gun/issues/1070

nsreed avatar May 01 '21 23:05 nsreed