That's...interesting. bees doesn't do range checks on hash values, and I can't think of anything else that would depend on a match between the contents of a hash table page...
I see `block_read=6724`, which will touch most of the extents in an 8192-extent hash table. Each extent is 128K, so in theory reading just 32MB of data will touch the...
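For concreteness, a sketch of the arithmetic (assuming 4K data blocks, one hash insert per block, and uniformly distributed hashes):

```sh
# 32MB of data at 4K per block gives 8192 hashed blocks:
echo $(( 32 * 1024 * 1024 / 4096 ))   # 8192
# 8192 extents of 128K each is a 1GB hash table:
echo $(( 8192 * 128 * 1024 ))         # 1073741824 bytes = 1GB
```

With 8192 uniformly distributed inserts into 8192 extents, roughly 1 - 1/e ≈ 63% of the extents get dirtied, so even a small read workload can dirty most of the table.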
bees excludes beeshash.dat from scanning (it prevents the file from being opened at all by either scan or dedupe). You can also place BEESHOME on another filesystem (doesn't have to...
There is kind of a side-bug here: currently we flush `beescrawl.dat` at 15-minute intervals and `beeshash.dat` at a rate of 4GB per hour (so for a 1GB hash table that also works out to 15-minute intervals),...
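The arithmetic behind that parenthetical, as a sketch:

```sh
# At a flush budget of 4GB per hour, a 1GB hash table takes one
# quarter of an hour per full pass:
echo $(( 60 / 4 ))   # 15 minutes to flush 1GB at 4GB/hour
```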
This looks similar to #199, but there are enough files here that we shouldn't be hitting extent insertion collisions that often. Questions: how big is the hash table,...
A filesystem with 400 GB of data (uncompressed size) should have a hash table sized between 40MB and 400MB (see the sizing chart in https://zygo.github.io/bees/config.html). A 128K hash file has...
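Those bounds work out to roughly 1/1000 and 1/10000 of the unique data size (my reading of the numbers above, not an authoritative sizing rule):

```sh
# 400GB of unique data, expressed in MB:
echo $(( 400 * 1024 / 1000 ))    # ~409 MB at a 1000:1 data:hash ratio
echo $(( 400 * 1024 / 10000 ))   # ~40 MB at a 10000:1 data:hash ratio
```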
> its (auto-created) hash file is too small?

[picture of hash table histogram showing 8192 entries = 128K, the minimum size]

Also...isn't the default hash table size 1G? In `beesd.in`...
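The 8192-entries = 128K equivalence follows from the hash table entry size (assuming 16-byte entries, i.e. an 8-byte hash plus an 8-byte address):

```sh
echo $(( 8192 * 16 ))   # 131072 bytes = 128K
```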
Oops...a commit in 2016 (6fa8de660b9850640e1213791020e82a9d170af9) will auto-create a hash table if `.beeshome` or `$BEESHOME` already exists, but there's no way to specify a size yet, so it picks the minimum...
Point `$BEESHOME` to somewhere on the SSD for hash table and crawl position storage (use an absolute path). `$BEESHOME` does not need to be on btrfs. Also mount the target...
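A minimal setup sketch (paths and size are examples; the `truncate` pre-sizing step works around the minimum-size auto-creation noted above):

```sh
# Keep bees metadata on an SSD filesystem (need not be btrfs):
mkdir -p /mnt/ssd/beeshome
chmod 700 /mnt/ssd/beeshome
# Pre-create the hash table at the desired size so bees doesn't
# auto-create one at the 128K minimum:
truncate -s 1G /mnt/ssd/beeshome/beeshash.dat
# Absolute path, as noted above:
export BEESHOME=/mnt/ssd/beeshome
```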
> polling the btrfs transaction ID should be cached and not registered as I/O, correct?

That's the theory, but I haven't tried it with real hardware. Let us know...
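One way to check (a hypothetical measurement, not a documented procedure; `sda` is a placeholder for the device backing the filesystem):

```sh
# Compare the sectors-read counter (field 6 of /proc/diskstats) across
# an otherwise-idle interval while bees is only polling the transid:
before=$(awk '$3 == "sda" { print $6 }' /proc/diskstats)
sleep 60
after=$(awk '$3 == "sda" { print $6 }' /proc/diskstats)
echo "sectors read while polling: $(( after - before ))"
```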