Local File Cache (On upload and download)

Open karbowiak opened this issue 9 years ago • 11 comments

Hi!

I was wondering if it would be possible to have a local file cache for the fuse mount, so that:

  1. Files being uploaded are written into that directory first and then uploaded. The program doing the writing to the fuse mount would be told the write had completed as soon as the local copy was written (even though the upload really hadn't finished).
  2. Files that have been read from the fuse mount are stored locally for a period of time (think LRU cache) and then thrown out once the cache size limit has been reached.

Number 1 might not be liked by everyone, but number 2 should definitely be a fan favorite :) Both should help cover the speed issues with the FUSE mount at least, and the second one is just common sense, I guess :P
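
For anyone wondering what item 2 could look like, here is a minimal sketch of a size-bounded LRU cache for downloaded files. It is not acd_cli code: the class name `LRUFileCache`, the `node_id` keys and the in-memory index are all assumptions made for illustration.

```python
import os
from collections import OrderedDict


class LRUFileCache:
    """Size-bounded index of locally cached files (hypothetical sketch)."""

    def __init__(self, cache_dir, max_bytes):
        self.cache_dir = cache_dir
        self.max_bytes = max_bytes
        self.total_bytes = 0
        # node id -> size in bytes, least recently used first
        self._entries = OrderedDict()
        os.makedirs(cache_dir, exist_ok=True)

    def _path(self, node_id):
        return os.path.join(self.cache_dir, node_id)

    def get(self, node_id):
        """Return the cached file's path and mark it recently used, or None."""
        if node_id not in self._entries:
            return None
        self._entries.move_to_end(node_id)
        return self._path(node_id)

    def put(self, node_id, data):
        """Store downloaded bytes, then evict least recently used files."""
        if node_id in self._entries:
            self.total_bytes -= self._entries.pop(node_id)
        with open(self._path(node_id), "wb") as fh:
            fh.write(data)
        self._entries[node_id] = len(data)
        self.total_bytes += len(data)
        # Drop the oldest entries until the cache fits the size limit again.
        while self.total_bytes > self.max_bytes and self._entries:
            old_id, old_size = self._entries.popitem(last=False)
            os.remove(self._path(old_id))
            self.total_bytes -= old_size
```

A read handler would call `get()` first and only download (then `put()`) on a miss. A real implementation would also need to persist the index across remounts and cache partial reads, which this sketch ignores.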

karbowiak avatar Apr 18 '16 20:04 karbowiak

I agree that both would be valuable options that a user could enable/disable as needed.

dellipse avatar Apr 18 '16 20:04 dellipse

+1

ctejada10 avatar Apr 29 '16 03:04 ctejada10

This sounds like something that could be achieved with aufs and a crontab script that initiates an upload every so often with the "--remove-source-files" option enabled.

EldonMcGuinness avatar Apr 29 '16 03:04 EldonMcGuinness

Honestly, I am more interested in a download cache, rather than upload. I use my ACD for video storing, and it bothers me a little to wait for the file to be "ready" for streaming when I've already streamed that file recently.

So I was thinking of saving the actual contents of the file somewhere and deleting it after X days.
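
As a rough sketch of that age-based cleanup, assuming the cached copies live in a plain directory and that modification time is an acceptable stand-in for "when it was fetched" (the function name `purge_older_than` is made up for this example):

```python
import os
import time


def purge_older_than(cache_dir, max_age_days, now=None):
    """Delete files under cache_dir last modified more than max_age_days ago.

    Returns the list of removed paths.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.getmtime(path) < cutoff:
                os.remove(path)
                removed.append(path)
    return removed
```

Run from cron or a timer, this does the same job as a `find ... -mtime` cleanup, just in a form that could live inside the mount process.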

ctejada10 avatar Apr 29 '16 03:04 ctejada10

I guess what I said addresses the first option, but I can see the need for the second option. Though for people who edit files in more than one place and rely on the data being in sync in the cloud, the second option would be a no-no, unless acd_cli did a hash check against the server each time the cached file is used to see if there is a new version, and then updated the cache as needed before opening the file for I/O.

Though I guess the cache could be optional or allowed to be set to 0 to make sure that only streaming is done.
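
The hash check could look roughly like the sketch below. It assumes the server can report an MD5 for the remote file; `fetch_remote_md5` and `download` are hypothetical callables supplied by the caller, not acd_cli functions.

```python
import hashlib
import os


def file_md5(path, chunk_size=1 << 20):
    """Hash a local file in chunks so large media files fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def open_validated(cache_path, fetch_remote_md5, download):
    """Refresh cache_path if it is missing or stale, then open it for reading.

    fetch_remote_md5() -> hex digest of the current remote version (assumed).
    download(path)     -> writes the current remote content to path (assumed).
    """
    remote_md5 = fetch_remote_md5()
    if not os.path.exists(cache_path) or file_md5(cache_path) != remote_md5:
        download(cache_path)
    return open(cache_path, "rb")
```

Re-hashing the cached file on every open is expensive for large files, so in practice the hash would probably be stored alongside the cache entry when it is first downloaded.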

EldonMcGuinness avatar Apr 29 '16 03:04 EldonMcGuinness

Yes, of course. This should be an optional feature that the user can enable or disable at will. My use case is very particular and I wouldn't expect to force it upon the other users.

ctejada10 avatar Apr 29 '16 03:04 ctejada10

ctejada10, I achieve this local "cache" by creating a union-fuse mount between a local folder and an ACD folder. I upload nightly and delete any file older than 14 days.

union-fuse: unionfs-fuse -o cow /home/msc/local-sorted=RW:/home/msc/acd-sorted=RO /home/msc/sorted/

nightly upload:
acd_cli upload -x 2 -r 5 /home/msc/.local-sorted/* /
find /home/msc/.local-sorted/ -type f -mtime +15 -exec rm -rf {} \;

Most of these ideas came from an article over at amc.ovh

But I agree, having an on-demand/dynamic cache would be more efficient.

endiz avatar May 04 '16 16:05 endiz

Hey endiz,

I'm one of the collaborators of that blog. Glad to see it caught on!

The problem at hand, sadly, is not solved by the union mount. I am referring to the case where I stream a media file right off the ACD mount, one that has already been deleted from my local storage, and then, for some reason, I stop the stream. When I restart the stream, I find it wasteful to download the file from ACD again. It'd be really cool if the ACD mount could set aside a couple of gigs as cache storage for files that have been recently fetched, so I don't need to waste bandwidth and time re-downloading them.

ctejada10 avatar May 04 '16 17:05 ctejada10

This would fix #376 and #185.

This is what e.g. the OpenStack Swift FUSE driver does for writing files: https://github.com/redbo/cloudfuse

Related code: https://github.com/redbo/cloudfuse/blob/master/cloudfuse.c#L256-L289

Thinkscape avatar Aug 15 '16 19:08 Thinkscape

@endiz I've found a serious limitation of the unionfs approach.

For deletions, both branches (local path and ACD) have to be set to RW. The problem is that for files within folders that have already been "synced" into the lower branch (ACD), unionfs will write directly into ACD. This is simply because it works on a path basis: it notices that the path is stored in ACD, so any writes are performed there (new files, changes to existing files, new subdirs etc.).
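
To illustrate, here is a simplified model of that path-based choice. It is only a sketch of the behavior described above, not unionfs-fuse's actual code; the function name and the `(root, writable)` branch tuples are assumptions.

```python
import os


def pick_write_branch(rel_path, branches):
    """Pick the branch a new file at rel_path would land in.

    branches: list of (root, writable) tuples in priority order.
    The first branch that already contains the parent directory wins if it
    is writable; otherwise the write falls back to the first writable branch.
    """
    parent = os.path.dirname(rel_path)
    for root, writable in branches:
        if os.path.isdir(os.path.join(root, parent)):
            if writable:
                return root
            break  # path lives on a read-only branch: fall back below
    for root, writable in branches:
        if writable:
            return root
    return None
```

With both branches RW and a folder that exists only on the ACD side, this model returns the ACD root for a new file in that folder, which is the direct write to ACD described above. With ACD set to RO it returns the local root instead.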

Thinkscape avatar Aug 15 '16 20:08 Thinkscape

Any traction on this or solution?

I find the other issue is that Amazon Cloud Storage is currently having issues: it frequently stops working, but starts working again within seconds to minutes. So if a file were cached, it would help with that issue.

Crenor avatar Apr 26 '17 21:04 Crenor