[ENG-708] Thumbnail sharding

Open jamiepine opened this issue 2 years ago • 2 comments

This is a simple PR to improve the way we store thumbnails in the data folder. After a nice chat with ChatGPT I discovered why most apps shard their cache folders: even Apple apps store files internally in hex-coded folders to prevent directory sizes from growing too large. Since my thumbnail folder is well over 85,000 items and takes a minute to open in Finder, this could become an issue faster than we'd expect. Sharding by hex prefix is a common pattern for storing large collections of cached files.

calc_shard_hex takes a cas_id as input, computes a blake3 hash of the filename, and returns the first two characters of the hash as the directory name. Because we're using the first two characters of a blake3 hash, this will give us 256 (16*16) possible directories, named 00 to ff.
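For illustration, here is a minimal sketch of the sharding idea, not the actual Spacedrive code. It assumes the cas_id is already a hex string, so it slices the first two characters directly instead of running blake3 over it. That gives the same 256-bucket layout. The `thumbnail_path` helper, the `.webp` extension, and the `Option` return types are assumptions made for this example.

```rust
use std::path::{Path, PathBuf};

/// Returns the two-character shard directory name for a `cas_id`.
/// Assumption: `cas_id` is already hex-encoded, so no extra hashing is done.
/// Returns `None` if the id is shorter than two characters or the prefix is
/// not hexadecimal (upper- or lowercase digits are both accepted).
fn calc_shard_hex(cas_id: &str) -> Option<&str> {
    let shard = cas_id.get(..2)?;
    shard.bytes().all(|b| b.is_ascii_hexdigit()).then_some(shard)
}

/// Hypothetical helper: builds `<root>/<shard>/<cas_id>.webp`.
fn thumbnail_path(root: &Path, cas_id: &str) -> Option<PathBuf> {
    let shard = calc_shard_hex(cas_id)?;
    Some(root.join(shard).join(format!("{cas_id}.webp")))
}
```

Calling `thumbnail_path(Path::new("thumbnails"), "3fa9c1d2e4b57a60")` places the file under the `3f` shard directory. With 85,000 thumbnails spread evenly, each of the 256 directories would hold roughly 330 files.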

jamiepine avatar Jun 07 '23 04:06 jamiepine

What is the purpose of hashing the cas_id, isn't the cas_id already a hash?

oscartbeaumont avatar Jun 07 '23 20:06 oscartbeaumont

What is the purpose of hashing the cas_id, isn't the cas_id already a hash?

You are so correct, it is indeed already hexadecimal. Fixed!

jamiepine avatar Jun 08 '23 05:06 jamiepine