fileconveyor
fileconveyor copied to clipboard
FileConveyor's inotify FSMonitor breaks when it finds temporary files in it's path.
I'm having issues with fileconveyor getting hung up on vim temporary files. I know I could exclude these from my paths, but ideally the daemon would not crash when a file goes away between the time it finds it and attempts to do something with it.
Better error handling needs to exist, so that the daemon never dies. It this involves wrapping the entire while(1) in a try catch and spitting out errors then restarting, then so be it.
It should also provide useful notification (perhaps via email) that it encountered a problem. Because currently when the server gets hung....essentially your CDN will start to fall behind and your site could end up being totally fubar'd.
I just ran into this again. It happens when I'm editing a with vim under the file conveyor path
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib/python2.6/threading.py", line 532, in **bootstrap_inner
self.run()
File "/usr/lib/pymodules/python2.6/pyinotify.py", line 1356, in run
self.loop()
File "/usr/lib/pymodules/python2.6/pyinotify.py", line 1342, in loop
self.process_events()
File "/usr/lib/pymodules/python2.6/pyinotify.py", line 1134, in process_events
watch_.proc_fun(revent) # user processings
File "/usr/lib/pymodules/python2.6/pyinotify.py", line 811, in __call**
return _ProcessEvent.__call__(self, event)
File "/usr/lib/pymodules/python2.6/pyinotify.py", line 544, in **call**
return meth(event)
File "/home/www/domain.tld/fileconveyor/wimleers-fileconveyor-d1c55b8/code/fsmonitor_inotify.py", line 233, in process_IN_CREATE
self.__update_pathscanner_db(event.pathname, FSMonitor.CREATED)
File "/home/www/domain.tld/fileconveyor/wimleers-fileconveyor-d1c55b8/code/fsmonitor_inotify.py", line 205, in __update_pathscanner_db
st = os.stat(pathname)
OSError: [Errno 2] No such file or directory: '/home/www/domain.tld/htdocs/sites/all/themes/theme_zen/js/.functions.js.swx'
It should most definitely not crash. It should simply detect the deletion and be done with it.
To test, open fileconveyor under a path.
Inside that path, open up a file with vim. Save and close the file.
I stopped using fileconveyor because it was much too buggy when I was playing around with it for use on a live server. Perhaps this has been fixed by now, but in my past experience this process I've mentioned was causing a problem.
I just tested it and I cannot get it to crash:
2012-01-20 07:36:42,865 - Arbitrator - WARNING - Synced: '/Users/wimleers/Desktop/watch/.bar.swp' (CREATED).
2012-01-20 07:36:43,271 - Arbitrator - WARNING - Synced: '/Users/wimleers/Desktop/watch/bar' (CREATED).
2012-01-20 07:36:48,290 - Arbitrator - WARNING - Synced: '/Users/wimleers/Desktop/watch/.bar.swp' (DELETED).
2012-01-20 07:36:49,307 - Arbitrator - WARNING - Synced: '/Users/wimleers/Desktop/watch/bar' (DELETED).
2012-01-20 07:37:03,937 - Arbitrator - WARNING - Synced: '/Users/wimleers/Desktop/watch/.baz.swp' (CREATED).
2012-01-20 07:37:07,363 - Arbitrator - WARNING - Synced: '/Users/wimleers/Desktop/watch/.baz.swp' (DELETED).
2012-01-20 07:37:07,369 - Arbitrator - WARNING - Synced: '/Users/wimleers/Desktop/watch/baz' (CREATED).
Either it has been fixed or I cannot reproduce it on OS X.
If you use pip
, you can now easily install and uninstall File Conveyor and its dependencies, by doing this:
pip install -e git+https://github.com/wimleers/fileconveyor@master#egg=fileconveyor
Test with something like this: https://gist.github.com/1647418. Then you should be able to either reproduce the problem easily or (hopefully) not.
Any news, @j0rd?
No, I stopped using fileconveyor a long time ago, sorry.
Np :)
Just discovered yesterday then rsyncing into a folder 'watched' by fileconveyor causes the problem mentioned above: temporary files cause FileConveyor to crash.
@ericssmith Is this with 0.3-dev? Temporary files should work just fine. Also see what I said a few comments higher, "I just tested it […]".
Just to let you know, I got this issue with vim on an RHEL box too. Was some time ago (3 months ish) but still relevant I think.
We are going to repeat the failing procedure and post the results here today.
Unfortunately, we don't have time to recreate the error, but here is what the log looked like:
2012-10-30 15:25:01,953 - Arbitrator.FSMonitor - INFO - Detected 'CREATED' event for '/var/www/vhosts/mainsite/sites/default/files/.pumpkin_bread.jpg.M2B4iO' through inotify (for monitored path '/var/www/vhosts/mainsite/sites/default'). 2012-10-30 15:25:02,151 - Arbitrator - INFO - Discover queue -> pipeline queue: '/var/www/vhosts/mainsite/sites/default/files/.pumpkin_bread.jpg.M2B4iO'. 2012-10-30 15:25:02,161 - Arbitrator - INFO - Pipeline queue -> filter queue: '/var/www/vhosts/mainsite/sites/default/files/.pumpkin_bread.jpg.M2B4iO'. 2012-10-30 15:25:02,164 - Arbitrator - INFO - Filtering: dropped '/var/www/vhosts/mainsite/sites/default/files/.pumpkin_bread.jpg.M2B4iO' because it no longer exists.
This is where it died. The error itself is not in the log. Sorry that we won't be able to reproduce it. but will report it if we run into it again. However we are going to set the temp files for rsync to not be in the receiving directory, so that seems unlikely.
(and thanks for your work on this. It goes braindead occasionally, but otherwise works as advertised)
Hmm … that's very odd. That means it actually died just after dropping a file (i.e. dropping it from the queue of files to deal with). Very odd.
Did you try setting File Conveyor's logging level to DEBUG
? That should help.
I know that it's imperfect, and not as reliable as it should be, but with every edge case that I close, we're getting closer to stability :)
To duplicate the bug. I'd suggest making a small script which creates and deletes random files then unleash it under a fileconveyor path.
If it dies, you've found my bug.
I actually did just that but couldn't get it to crash at all.
Hello there! I have reproduced this error, but I think it happens when there is a temporary file at one time, but then that temporary file is gone when it attempts to transfer the file.
2012-11-20 14:17:46,826 - Arbitrator - WARNING - Synced: '/var/www/vhosts/xxxxx/sites/default/files/IMG_9738-1.JPG' (DELETED). Exception in thread Thread-1: Traceback (most recent call last): File "/usr/lib64/python2.6/threading.py", line 532, in bootstrap_inner self.run() File "/usr/lib/python2.6/site-packages/pyinotify.py", line 1410, in run self.loop() File "/usr/lib/python2.6/site-packages/pyinotify.py", line 1396, in loop self.process_events() File "/usr/lib/python2.6/site-packages/pyinotify.py", line 1195, in process_events watch_.proc_fun(revent) # user processings File "/usr/lib/python2.6/site-packages/pyinotify.py", line 840, in __call return _ProcessEvent.call(self, event) File "/usr/lib/python2.6/site-packages/pyinotify.py", line 571, in call return meth(event) File "/usr/local/bin/fileconveyor/fsmonitor_inotify.py", line 253, in process_IN_MODIFY self.__update_pathscanner_db(event.pathname, FSMonitor.MODIFIED) File "/usr/local/bin/fileconveyor/fsmonitor_inotify.py", line 205, in __update_pathscanner_db st = os.stat(pathname) OSError: [Errno 2] No such file or directory: '/var/www/vhosts/xxxxx/sites/default/files/.2012-11-12 03.32.06.jpg.PnN0nx'
This happens very intermittently when a file is RSYNC-ed into the path that fileconveyor is monitoring, when that file is gone when it gets to transfering it or something like that.
I have RSYNCed many files into the path and it then picks them up and transfers them fine, it seems to be only this rareish condition that shuts down fileconveyor.
Aha! Now that is interesting :) Maybe it's happening on a system with a relatively workload, meaning that the inotify events are somewhat lagging behind, in which case this could indeed happen. FSMonitorInotifyProcessEvent.__update_pathscanner_db()
should be checking whether the file still exists (os.path.exists(input_file)
), much like Arbitrator.__process_filter_queue()
does.
Could you make that simple change and check whether that fixes the problem?
Hello, it has been a while but I wanted to update you. It turned out that even if I checked if the file exists with os.path.exists(pathname) right before the line os.stat(pathname), it would still happen that a temporary file could be deleted after it had gone past the check but before it got to os.stat, causing an error.
So what I ended up doing was ignoring the error, here is the code if you wanted to improve upon it. This is in fsmonitor_inotify.py:
(path, filename) = os.path.split(pathname)
if event_type == FSMonitor.DELETED:
# Build tuple for deletion of row in PathScanner's DB.
t = (path, filename)
self.fsmonitor_ref.pathscanner_files_deleted.append(t)
else:
try:
# Build tuple for PathScanner's DB of the form (path, filename,
# mtime), with mtime = -1 when it's a directory.
st = os.stat(pathname)
is_dir = stat.S_ISDIR(st.st_mode)
if not is_dir:
mtime = st[stat.ST_MTIME]
t = (path, filename, mtime)
else:
t = (path, filename, -1)
# Update PathScanner's DB.
if event_type == FSMonitor.CREATED:
self.fsmonitor_ref.pathscanner_files_created.append(t)
else:
self.fsmonitor_ref.pathscanner_files_modified.append(t)
except OSError, e:
print "Caught a OSError, probably a file doesn't exist"
print pathname
Pull request submitted after testing and verifying this change has solved the issue.
https://github.com/wimleers/fileconveyor/pull/144