Support eviction based on LRU
Assume the user wants about 10k files per folder in a very large system. The server has 10 SSDs of 4 TB each, mounted on
/disk1, /disk2, ..., /disk10
The user has configured a total of 1k directories (100 directories per disk) to be used for caching:
/data0 to /data99 on disk 1
/data100 to /data199 on disk 2
...
/data900 to /data999 on disk 10
Assuming the block size is configured as 4 MB and files are evenly distributed, this system can hold a total of 10 million files (40 TB / 4 MB), or about 10k files per folder.
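The sizing above can be checked with a few lines of arithmetic. This is only a sketch: the variable names are made up for illustration, and it assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes), which is what the 40 TB / 4 MB estimate implies.

```python
# Decimal units, matching the 40 TB / 4 MB estimate in the proposal.
TB = 10**12
MB = 10**6

# Hypothetical names for the numbers quoted in the proposal.
num_disks = 10
disk_capacity = 4 * TB
dirs_per_disk = 100
block_size = 4 * MB

total_capacity = num_disks * disk_capacity    # 40 TB across all SSDs
total_files = total_capacity // block_size    # one cache file per block
total_dirs = num_disks * dirs_per_disk        # 1k cache directories
files_per_dir = total_files // total_dirs     # assumes an even distribution

print(total_files, total_dirs, files_per_dir)
```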
The user has also configured the max disk usage as 90% and the per-folder eviction percentage as 4%:
max.disk.usage.percentage -> the disk usage at which eviction is triggered
folder.max.evection.percentage -> how much space needs to be freed per folder when eviction is triggered
We do not want eviction to cross the folder boundary, since that would require listing and sorting files by access time across multiple folders.
Before writing a new file to a given folder, say /data7, the extension should check the usage of the filesystem on which the directory is mounted.
If (old usage + new file size) is more than (max.disk.usage.percentage x disk_capacity), then delete the least recently accessed files in that folder until about (folder.max.evection.percentage x total_folder_storage) has been freed.
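A minimal sketch of that pre-write check, assuming the usage of the underlying filesystem is read with Python's shutil.disk_usage. The function names exceeds_limit and needs_eviction are hypothetical and are not part of the extension.

```python
import shutil


def exceeds_limit(used, total, new_file_size, max_disk_usage_percentage=90):
    """True if (old usage + new file) is above the configured share of capacity.

    Hypothetical helper; integer arithmetic avoids float rounding at the boundary.
    """
    return (used + new_file_size) * 100 > total * max_disk_usage_percentage


def needs_eviction(cache_dir, new_file_size, max_disk_usage_percentage=90):
    """Check the filesystem that cache_dir (e.g. /data7) is mounted on."""
    usage = shutil.disk_usage(cache_dir)  # named tuple: total, used, free (bytes)
    return exceeds_limit(usage.used, usage.total, new_file_size,
                         max_disk_usage_percentage)
```

For example, needs_eviction("/data7", 4 * 10**6) would be called right before writing a 4 MB block into /data7.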
This requires listing and sorting the files inside the folder, but since the files are distributed across multiple folders, it should be relatively fast (about 10k files per folder).
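The per-folder step could look roughly like the sketch below. It assumes "earliest" means oldest access time (st_atime) and that the eviction target is a percentage of the folder's current size; evict_oldest is a hypothetical name. Note that access times are only as accurate as the mount options allow (noatime or relatime mounts update them rarely or not at all).

```python
import os


def evict_oldest(folder, evict_percentage=4):
    """Delete the least recently accessed files until about evict_percentage
    of the folder's bytes have been freed. Returns the paths removed.

    Hypothetical helper; never looks outside the given folder.
    """
    entries = []
    with os.scandir(folder) as it:
        for entry in it:
            try:
                if not entry.is_file(follow_symlinks=False):
                    continue
                st = entry.stat(follow_symlinks=False)
            except OSError:
                continue  # file vanished while listing; ignore it
            entries.append((st.st_atime, st.st_size, entry.path))

    entries.sort()  # oldest access time first
    target = sum(size for _, size, _ in entries) * evict_percentage / 100

    freed = 0
    removed = []
    for _, size, path in entries:
        if freed >= target:
            break
        try:
            os.remove(path)
        except OSError:
            continue  # best effort: skip files that cannot be deleted
        freed += size
        removed.append(path)
    return removed
```

With about 10k entries per folder, the scan and sort are a single directory listing plus one stat call per file.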
We also need to make this best effort: before reading, the process should acquire a read lock on the file, and when deleting, it should try to delete and ignore any failure.
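One way to sketch the best-effort behavior on a POSIX system is with advisory flock locks. This is an assumption about the mechanism, not something the proposal specifies: readers take a shared lock, and the evictor tries a non-blocking exclusive lock first, because a plain unlink would succeed even while a reader holds the file. The names read_cached and try_evict are hypothetical.

```python
import fcntl
import os


def read_cached(path):
    """Read a cache file under a shared (read) lock.

    Returns None if the file has already been evicted.
    """
    try:
        f = open(path, "rb")
    except FileNotFoundError:
        return None
    with f:
        fcntl.flock(f, fcntl.LOCK_SH)
        try:
            return f.read()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


def try_evict(path):
    """Try to delete a cache file; any failure is ignored and reported as False."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        return False  # already gone or unreadable
    try:
        # Fails immediately with an OSError if a reader holds the shared lock.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        os.remove(path)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)  # closing the descriptor also releases the lock
```

A file that is being read is simply skipped by the evictor and stays eligible for the next eviction pass.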
Sorry for the delay, I implemented a basic on-disk LRU for a single process here: https://github.com/dentiny/duck-read-cache-fs/pull/245 Let me know if you need it to work for multi-process, thanks!