cloudstorage
Read File if already in LocalCache and Cleaner
It would be nice to read a file straight from the local file cache if it already exists there, comparing its md5 against the remote object's md5 to make sure they match. Use the md5 as the filename? This would require a couple of cleaner strategies, time-based as well as size-based.
This would be a big optimization for some access patterns. I'll take some time to think about the changes needed for this.
Offhand, I believe we'd have to:
- Switch the store to use a UUID per store cachedir (not per object), to avoid issues with multiple stores on the same system trying to clean up each other's files. *Or we use filelocks, but meh...
- Change OpenFile so it pulls the metadata first, then uses the metadata Hash to check for the presence of the cached file with the same Hash.
- Update the LastModified data for the cache file each time it's Opened?
- Keep a counter of Opens on the cache file, increment in OpenFile and decrement in obj.Close(). *Or use filelocks...
- Make a number of changes to https://github.com/lytics/cloudstorage/blob/master/cachecleaner.go to have it be smarter about what to clean up, e.g. check open-file counts.
- Possibly: make the cache cleanup configurable with settings so the files can be cleaned up least-recently-used first: MaxFileCount, MaxAge, MaxBytes... We could get even smarter and enforce a %free for the disk [Shrug].
This would be great, looks like it has been sitting a while.
As far as S3 goes, we could utilize the ETag to verify whether the backend file has changed.
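A small sketch of what that comparison could look like, assuming the ETag string has already been fetched from S3. The helper name `etagMatches` is made up. One caveat: for multipart uploads the ETag has the form `<hex>-<parts>` and is not a plain md5 of the object, so the sketch reports those as not usable instead of as a mismatch.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"strings"
)

// etagMatches compares an S3 ETag value against the md5 of local data.
// usable is false when the ETag does not look like a plain 32-char md5
// (for example a multipart ETag such as "abc...-3"); in that case match
// is meaningless and the caller needs another way to detect changes.
func etagMatches(etag string, data []byte) (match, usable bool) {
	etag = strings.Trim(etag, `"`) // S3 returns the ETag wrapped in quotes
	if len(etag) != 32 || strings.Contains(etag, "-") {
		return false, false
	}
	sum := md5.Sum(data)
	return strings.EqualFold(etag, hex.EncodeToString(sum[:])), true
}
```

For large cached files it would be cheaper to store the ETag next to the cache file when it is written and compare strings, rather than re-hashing the contents.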