vfs: add ability to exclude files from being uploaded (eg for temporary files)
Issue 1: rclone uploads ~partial files (which are not ready yet) and I want rclone mount to "ignore" these files and not upload them to the cloud.
Issue 2: I want rclone to ignore whole directories and not upload them from its mount, e.g. the .grab directories that Plex DVR creates.
I've tried --exclude and couldn't get it to work as expected (or at all); rclone mount does not respect exclude parameters.
Thanks!
Are you using the cache with the mount?
I don't have this issue as I have a ten-minute delay timer on the cache upload. On top of that, the partials are written to a local cache, and rclone won't upload or even queue the upload until the file is unlocked (i.e. no longer being written to). By the time it's unlocked it has usually been moved/renamed anyway, prior to upload.
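For reference, that kind of delayed-upload setup with the cache backend looks roughly like the following; the remote name, mount point and paths are placeholders, and the 10m delay just mirrors the ten-minute timer mentioned above:
rclone mount gcache: /mnt/media --allow-other --cache-tmp-upload-path /var/cache/rclone/tmp-upload --cache-tmp-wait-time 10m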
Still having the problem. Basically, Plex writes to the .grab directory until the DVR recording is finished, then moves it to the correct place. I simply don't want rclone to upload anything that is in .grab.
@daniel-loader do you have the delay set for the vfs cache? What's the trigger for this?
This isn't a vfs cache, but the actual cache remote that wraps the Google drive remote.
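For context, that layering - a cache remote wrapping the Google Drive remote - is set up along these lines in rclone.conf (remote names are placeholders, OAuth details omitted), and the mount then points at the cache remote rather than the drive remote:
[gdrive]
type = drive
# token / client_id etc. omitted

[gcache]
type = cache
remote = gdrive:media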
--exclude works for me
$ rclone mount /tmp/big /tmp/mnt/
$ ls /tmp/mnt/
100M 1G directory new-file new-file3 new-file5 potato3.txt
120M 200M hello.txt new-file2 new-file4 potato2.txt potato.txt
$ rclone mount --exclude '*new*' /tmp/big /tmp/mnt/
$ ls /tmp/mnt/
100M 1G directory potato2.txt potato.txt
120M 200M hello.txt potato3.txt
What are you trying?
I think I may know what the OP is trying and I might be having the same issue.
I have a gdrive->cache->crypt remote mounted using rclone mount, like this:
rclone mount gdrive-cache-crypt: /home/user/GDrive --exclude="*.tmp" --allow-other --cache-dir="/home/user/.cache/rclone/vfs-cache" --vfs-cache-poll-interval="1h" --vfs-cache-mode="writes" --vfs-cache-max-age="72h"
find /home/user/GDrive -name "*.tmp" finds nothing. So --exclude is working for the rclone mount command. But .tmp files are still being uploaded to the remote.
Same result using the cache backend only, like this:
rclone mount gdrive-cache-crypt: /home/user/Gdrive --allow-other --exclude=*.tmp --cache-tmp-upload-path=/home/user/.cache/rclone/tmp-cache --cache-tmp-wait-time=5m
How do I tell rclone that those files must be ignored and not uploaded?
Thanks!
@sbono - thanks for explaining - I see the problem now.
--exclude works on the listings that mount uses to see the remote files. However, nothing stops you uploading a file whose name is excluded.
This probably requires fixing in two places:
- the VFS layer, so excluded file names are never uploaded and are just kept in the local cache
- the cache backend, so it does the same thing
Or maybe it could be fixed just in the VFS layer... This would require rclone to ignore direct uploads of files with excluded names.
How's this issue looking? Any sort of workaround to prevent the vfs rclone mount from automatically uploading partial~ files? Thanks! :)
This wouldn't be too tricky to implement - does anyone fancy having a go with help from me?
Hello,
Has there been any progress with this or is there any other alternative to have rclone mount keep certain files on the local disk only?
I think the way to implement this would be to have a flag (or perhaps multiple flags), maybe --vfs-upload-exclude glob, where glob is as in the filtering rules, e.g. *~.
If a file matched this then the VFS would never upload it, though it would still download it and keep it in the VFS cache.
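As a sketch, an invocation using the proposed flag (which does not exist yet; both the flag and the patterns here are illustrative) might look like:
rclone mount remote: /mnt/remote --vfs-cache-mode writes --vfs-upload-exclude "*~" --vfs-upload-exclude "*.tmp"
Anything matching the globs would stay in the VFS cache and never be written back to the remote.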
How does that sound?
I'd think the best route to keep it simple is to follow the other filtering flags, or even a simple flag to apply the same filtering to VFS uploads?
Of course my idea of "keep it simple" may apply more to the user than to the code, so maybe your method is best. But if it's not too hard, maybe make the existing filter flags apply to VFS uploads as well? Perhaps a --vfs-upload-filter flag that just enables that same filtering, defaulting to on?
I'd think the best route to keep it simple is to follow the other filtering flags, or even a simple flag to apply the same filtering to VFS uploads?
So that would mean using the existing filter commands.
Let's say you were using --exclude *.tmp. This means that if there are any .tmp files on the remote they will not appear in the mount (this works now). With this extension, we would also filter uploads through the mount and not upload anything with that extension - that seems straightforward too.
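Concretely, the scenario under discussion is a mount like the one below; today only the listing is filtered, and filtering the uploads as well is the proposed extension:
rclone mount remote: /mnt/remote --vfs-cache-mode writes --exclude "*.tmp"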
However, the potential problem is what happens when a directory goes out of the directory cache. Let's say you made a file.tmp - this is visible in the mount only because it was created locally. When the directory cache entry expires this file will disappear, and since it isn't on the remote cloud storage it won't re-appear when the directory is re-read.
Maybe this is acceptable? (The --vfs-upload-filter flag has this problem too.) Or maybe this needs a bit more logic to unify what is in the cache with what is read from the remote when the directory is re-read after being dropped from the cache.
Another thought, what would happen if a file that is filtered from being uploaded is renamed to something that is not filtered? Would that be uploaded after the rename?
And what would happen in the reverse scenario? Something that is not filtered is renamed to something that is filtered?
Another thought, what would happen if a file that is filtered from being uploaded is renamed to something that is not filtered? Would that be uploaded after the rename?
Hmm, that is another corner case. At the moment the .tmp file gets uploaded, then renamed on the cloud storage. This would need another code path.
And what would happen in the reverse scenario? Something that is not filtered is renamed to something that is filtered?
...then there would be a file on the remote that we need to delete.
There are a lot of corner cases here :-(
The specific issue with partials being excluded until they are renamed can probably be solved by introducing a delay before the upload as discussed in #3186.
It won't solve the actual issue but it should satisfy everyone who has shown interest regarding that.
Any update here? It would be a really handy feature to be able to exclude certain files from being uploaded.
The specific issue with partials being excluded until they are renamed can probably be solved by introducing a delay before the upload as discussed in #3186.
This will be in the VFS revision which will go into 1.53 hopefully
Is there a way to exclude folders with the new VFS revision, @ncw ?
The specific issue with partials being excluded until they are renamed can probably be solved by introducing a delay before the upload as discussed in #3186.
This will be in the VFS revision which will go into 1.53 hopefully
This did go into 1.53 as the --vfs-write-back parameter
Is there a way to exclude folders with the new VFS revision
You can use the filter commands on a mount if you want to exclude a directory, however this may not do what you want (see above).
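For example, a mount combining the writeback delay with a listing exclude might look like the following; the delay value and the .grab pattern are just illustrative, and as noted the exclude only affects listings, not uploads:
rclone mount remote: /mnt/media --vfs-cache-mode writes --vfs-write-back 10m --exclude ".grab/**"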
Yes, but say we didn't want a file to LEAVE the cache, based on a regex (say it's being worked on by a piece of software).
I have the cache on an SSD and I constantly see log messages about the file queuing for upload in XX minutes every time I modify it, and I'd like to just ignore files based on a regex so I don't constantly get those verbose messages.
Another thought, what would happen if a file that is filtered from being uploaded is renamed to something that is not filtered? Would that be uploaded after the rename?
Hmm, that is another corner case. At the moment the .tmp file gets uploaded, then renamed on the cloud storage. This would need another code path.
And what would happen in the reverse scenario? Something that is not filtered is renamed to something that is filtered?
...then there would be a file on the remote that we need to delete.
There are a lot of corner cases here :-(
How about offloading the logic into a separate backend? They do not have to be backends, but the concepts might help reduce undefined behaviors. For different use cases:
- A permission backend with a transparency option defining the behavior when writing to a file without permission. The options could be error, persist, and blackhole. The persist option means keeping the file in the VFS cache, while blackhole means dumping the data silently. (There might not be many use cases for blackhole, but it is added for completeness.)
  - This is tricky for the VFS lifecycle. Should rclone dump the cache for persist after a restart?
  - Possibly separate the read filtering and the write filtering to reuse the filtering syntax.
- A diverge backend that is similar to union, but allows complex policies based on filtering. We can manually pick a cache location (even :memory: :smirk:) and manage the lifecycle. It's probably sufficient to support merging two remotes only.
  - I don't know, but does the VFS mount support direct writes to local? If not, the files might be moving around all the time.
- A pin backend that keeps some of the files available offline, and some of them never uploaded to the server. Some programs work as long as the files do not leave the cache, and don't mind updating the files to the remote from time to time, e.g. backup programs.
  - This effectively implements a selective sync client.
It may seem that these backends are bringing back the cache remote - IMHO it's not. Cache is elementary for all writable mounts, but these backends provide some extra functionality.
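Purely to illustrate the shape of the idea - none of these backends exist, and every option name below is invented - the permission backend could be configured something like:
[protected]
# hypothetical backend shown only to sketch the proposal
type = permission
remote = gdrive:media
# keep rejected writes in the VFS cache instead of erroring
transparency = persist
# hypothetical write-side filter reusing rclone's filter syntax
write_exclude = *.tmp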
This feature would be really useful - some kind of exclude pattern for folders and files. I have already searched for many workarounds but none of them work. E.g. I tested mergerfs in front of a host directory plus an rclone mount to split out working directories (to prevent those folders being uploaded to the rclone remote), but this doesn't work since most software uses the filesystem's rename operation, and that triggers an EXDEV error because you can't rename files across two different devices/filesystems. So, for example, software that stores files in /data/files and has a /data/tmp directory where it puts in-progress uploads, renaming them to /data/files/uploadedFile.ending when finished, breaks with this setup.
I would be really glad if rclone could get an exclude-from-upload filter feature, and also a feature to let the VFS cache keep directories or files that match a pattern indefinitely.
So it would be possible to say: don't upload /data/tmp, and cache /data/tmp indefinitely on my hard disk (to prevent deletion during cache cleanup).
For us, the feature would also be very important so we could exclude files that change too often or are only temporary.
Would anyone like to work on this feature - happy to talk it through?
Or alternatively maybe one of your companies would like to sponsor me to implement it?
I see a lot of errors from Dropbox temp files, like:
2021/07/28 00:07:25 ERROR : .~lock.features.odt#: Failed to copy: upload failed: batch upload failed: path/disallowed_name
I would really appreciate this feature
any news?
any news?
Got a plan yet? I'm currently writing in Typora, and each time I save, a temporary file is created and uploaded to the WebDAV server before being deleted. This doesn't seem very efficient, so it would be great if a filter could be added.

+1
I'm just coming to this, but I would find it useful to be able to exclude temporary files by a pattern to avoid them being uploaded and wasting ops/bandwidth.