mergerfs
[Feature] On input/output error try next available mount
Hi, I have been using mergerfs in combination with rclone to merge a remote gdrive and acd. However, due to Google's very strict user rate limits I regularly receive "... exceeded ...." errors, which in turn result in an "input/output error" within mergerfs.
Would it be possible to get mergerfs to try the next available mount (which also has the same file) automatically? So if, for whatever reason, the file is currently not readable in mountA, mergerfs automatically tries the file in mountB?
I was already starting to look into this actually.
There could be an impact on general read performance given how libfuse works. I'm in the process of doing some benchmarks to tell for sure. In the worst case I could make it an optional behavior.
It really shouldn't be much of a problem otherwise. Similar to the moveonenospc feature.
Great News :)
Also, is there any way to check which underlying mount a file is being "served" from via the mergerfs mount at a given time?
Background: I have some files available on a local HDD (a sort of cache) in order to avoid having Plex load them from the cloud drives. I did set the "ff" policy, however I am not confident that Plex, going through the mergerfs mount, actually reads from the local HDD (if the file is present) all the time.
If the file is open you can look in /proc/<mergerfs-pid>/fd/ and ls -lh to see all open files.
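For example, a rough sketch (assuming a single mergerfs instance; substitute the actual PID if you have several):
# pidof mergerfs
# ls -lh /proc/$(pidof mergerfs)/fd/
The symlinks point at the real paths of the open files on the underlying drives, so you can see which branch is actually being read.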
You can also use https://github.com/trapexit/mergerfs#file--directory-xattrs to see which would be picked up by the getattr policy. I've considered making it a bit more flexible so you could indicate which policy to use but it hasn't been a priority. I might be the only person to use that feature :)
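Something like this, assuming the xattr names documented in that README section (the path here is only an example):
# getfattr -n user.mergerfs.fullpath /mnt/mergerfs/test.mkv
That should print the full path on the underlying drive that the getattr policy resolves to.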
Brilliant... the directory xattrs do the trick!
One sticking point. Files opened with certain flags need to have the flags disabled or the behavior just not apply. For instance: O_CREAT, O_TRUNC, O_EXCL.
Not sure it makes sense to fall back to a file being read if that file was also being written to. Probably dangerous. It'd be preferred that only files open'ed readonly would be eligible for this behavior but I have the feeling files might be opened read/write which aren't generally written to.
Well I don't understand all of it :)
However, as for Plex it's read only, though up to now I haven't mounted the respective shares as RO. Probably time to get that done.
As far as I've got my head around mergerfs, it is not possible to tell mergerfs that two out of three mounts should be RO? So this means the underlying individual mounts need to be mounted as RO, right?
I can't entirely replicate the real behavior of making something read only. At best I could fake it (or just ignore tagged drives when creating things) to a degree. If you don't want writes to a drive through mergerfs (currently) then using the ro mount option for the device is best. Or do a bind mount which is read only and add that to mergerfs instead.
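A minimal sketch of the read-only bind mount approach (the paths are only examples):
# mkdir -p /mnt/gdrive-ro
# mount --bind /mnt/gdrive /mnt/gdrive-ro
# mount -o remount,ro,bind /mnt/gdrive-ro
# mergerfs -o defaults,allow_other /mnt/hdd:/mnt/gdrive-ro /mnt/mergerfs
Writes through mergerfs to that branch would then fail with EROFS instead of landing on the drive.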
Yeah, that's what I thought. So RO on mount it is :)
As for the discussed feature... i.e. a gdrive mounted via rclone mount will list a file without any problem. However, as soon as one tries to read it (actually look into the file) you get a 403 Forbidden (if you managed to trigger any of the countless limits imposed by Google), which in turn generates the input/output error.
Are you aware of any way to replicate rclone's error condition without needing to actually exceed limits?
Unfortunately no. This is something for @ncw
Maybe it's possible to recreate this "issue" (failed input/output) with some kind of test file?
Please try the read-failover branch. It will search for another file of the same name should a read return EIO or ENOTCONN (what happens when a FUSE fs exits). If no files are found it will return EIO.
The only issue is that multiple reads can be in flight when the error occurs and so multiple threads will receive EIO. I have to lock the failover behavior and, to keep it simple, I don't try to dedup the errors, so it will actually reopen the file in each thread. I can probably make it better but for now please try this.
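If anyone wants to try it, a typical from-source build would look something like this (assuming the usual mergerfs build dependencies are already installed):
# git clone -b read-failover https://github.com/trapexit/mergerfs.git
# cd mergerfs
# make
# make install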
Great!
... just installed it, will need to leave it running for a few days. I set gdrive as the first mount so Plex should take care of exceeding the limits soon :) and then hopefully Amazon Drive should step in with this version of mergerfs.
BTW... I'm still not sure about what happens when rclone is in the failed state. When it occurs please check if files are still visible and what happens if you try to stat or open them. Because if they still show up then future opens may fail and I'd have to do similar failover behavior on "open", but I'm not so sure I like that.
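Something along these lines would tell us, using test.mkv as the example file:
# stat /mnt/gdrive/test.mkv
# dd if=/mnt/gdrive/test.mkv of=/dev/null bs=1M count=1
i.e. whether the metadata lookup still succeeds while the actual open/read returns EIO.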
Any updates?
Sorry if this has been answered before. But with this branch, are there any settings you have to specify during mount to enable it?
My GDrive is currently in the ban state. I used the following mount settings to test:
mergerfs -o ro,direct_io,defaults,allow_other,minfreespace=1G,fsname=mergerfs /mnt/gdrive:/mnt/acd /mnt/mergerfs
I have a file called "test.mkv" saved in the root folder of both gdrive and Amazon Cloud Drive. I used ffprobe to check if this file's codec info can be read:
# cd /mnt/mergerfs
# ffprobe -v error -show_format test.mkv
test.mkv: Input/output error
The correct output should look like this (when I'm accessing Amazon Cloud Drive directly):
# ffprobe -v error -show_format /mnt/acd/test.mkv
[FORMAT]
filename=/data/acd/test.mkv
nb_streams=1
nb_programs=0
format_name=matroska,webm
format_long_name=Matroska / WebM
start_time=0.000000
duration=30.030000
size=74754816
bit_rate=19914702
probe_score=100
TAG:encoder=libebml v1.2.0 + libmatroska v1.1.0
TAG:creation_time=2016-02-06 04:00:52
[/FORMAT]
So with this branch, it doesn't seem to work for me and isn't failing over to Amazon Cloud Drive.
@trapexit I can confirm when rclone is in the fail/ban state, files are still visible, you just get "input/output" errors.
No, it's always enabled.
The branch makes it retry reads. If it doesn't get to the read it won't fail over. It's probably failing at open. The policy you're using is apparently picking the failed mount. Doing a similar failover behavior on open is possible but it's starting to get more and more invasive and special cased. It'd be better if rclone could just make the files disappear.
I'm using the default policy (category.search=ff). I've also used "rand" and "eprand" without success. Is there a policy that may work?
ff is based on the ordering you provide. If google is first it will always be hit. rand should work on occasion.
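If you want to avoid always hammering one branch you could reorder the branches or change the search policy, for example (a sketch; option names per the mergerfs docs):
# mergerfs -o ro,defaults,allow_other,category.search=rand /mnt/gdrive:/mnt/acd /mnt/mergerfs
But as long as the banned branch still lists the file, any policy can still pick it.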
The rclone behavior is not common in that it shows files as available even when they are inaccessible. If it returned errors when scanning for the file they'd be skipped. I'm not sure what rclone is doing is totally valid, but it's not a common situation so it's hard to say.
I have ways I could deal with this but I'll need to refactor a lot of things.
Hi, any update about this feature?
https://github.com/trapexit/mergerfs/tree/read-failover2
I've been waiting for people to confirm it works before moving forward with finishing it off.
Failover will only work for reading? I believe it is not viable for already-open file operations. But creating new files seems feasible.
What's the specific usecase? Read failover is explicitly designed and intended for unstable remote filesystems such as those with quotas where particular, known errors are returned.
My scenario is special. I use a Raspberry Pi to record videos. My project has two or more "disks" (actually an SD card and an SSD, both via USB; in the latter case I use the Raspberry Pi's SD card) for recording videos. At first, I save the video in RAM and then move the data to disk (first to the SSD). However, I have had several problems with the SSD stopping responding unexpectedly (in most cases the SATA-USB controller fails; it could be a kernel/driver failure as well). Here I tested mergerfs so that when the SSD fails, it writes directly to the SD. However, when the first branch fails, it returns an Input/Output Error. I can only make it write to the SD (second branch) when I mount the SSD as RO.
Ah. It's certainly possible. The problem is that every usecase is different in terms of the way software interacts with the filesystem and therefore which functions need to have this failover behavior, as well as which errors are in question. There are lots of errors that are totally legit and should not be "failed over" for. And in the case of 'read', for instance, it's not just 'read' that needs this behavior: a failed read can result in a successful short read, after which the kernel will issue a 'fstat', so that too has to handle the situation.