Toil uses `/var/run/user/$UID` if it sees it, whether or not it is actually available to Toil's current session
Hi,
I noticed that you are using /var/run/user/$UID to place local locks on files.
This creates problems on our cluster because of the way our sessions are managed: PAM does not initialize pam_systemd. I think this is normal behaviour for non-interactive sessions.
Indeed, pam_systemd is the PAM module responsible for creating /run/user/$UID, as you can read here: https://manpages.debian.org/unstable/libpam-systemd/pam_systemd.8.en.html
/var/run appears to be a symlink to /run, but I think this can vary across Linux distributions.
There is an interesting discussion about whether or not to use /var/run/user/$UID here: https://superuser.com/questions/1056926/is-var-run-user-uid-the-new-var-run-for-pid-files
As far as I understand, it is not created by default on Debian buster for non-interactive sessions.
Below are the relevant file contents, which I think come from https://packages.debian.org/fr/buster/libpam-runtime:
$ grep 'systemd' /etc/pam.d/common-session
session optional pam_systemd.so
$ grep 'systemd' /etc/pam.d/common-session-noninteractive
$
As far as I understand, beyond the PAM service, /var/run/user/<uid> can be created by any daemon managed by systemd with root privileges, which Toil workers are not.
Given those elements, I would not recommend using /var/run/user/$UID by default. I would not even recommend using it after checking that the folder exists, because that might collide with an interactive session that is currently running but may stop at any time.
What problem exactly is this causing for you @jfouret? Are you observing interactive sessions trying to clean up /var/run/user/$UID out from under a running Toil workflow in the wild?
I think the Right Way to do this is probably to not hardcode the path but to actually follow the spec and look at XDG_RUNTIME_DIR, because what we really need is exactly what that is specified to supply:
The directory MUST be on a local file system and not shared with any other system. The directory MUST be fully-featured by the standards of the operating system.
We can probably rely on having the variable set as indicating that we are in a session that is going to be allowed to use it.
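For concreteness, here is a minimal sketch of that check, assuming a hypothetical helper name (this is not Toil's actual code):

import os
import stat

def xdg_runtime_candidate():
    # Only trust XDG_RUNTIME_DIR if the session actually set it and it
    # points at an existing directory owned by the current user.
    path = os.environ.get("XDG_RUNTIME_DIR")
    if not path:
        return None
    try:
        info = os.stat(path)
    except OSError:
        return None
    if not stat.S_ISDIR(info.st_mode) or info.st_uid != os.getuid():
        return None
    return path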
For Toil clusters, where Kubernetes/Mesos aren't going to set up a proper XDG session, we could move the coordination directory to /run/toil or /var/run/toil, since we will have permission to set that up.
I think on HPC systems where we can't set up our own /run subdirectory and we don't have a real session, we could use the temp directory for this, and just hope that the system isn't using an NFS temp directory? Anywhere it worked before (and /var/run/user/$UID reliably exists) really ought to send XDG_RUNTIME_DIR, so falling back to the temp directory would only happen in places where we're currently falling back to the Toil work directory, in the temp directory, already.
We might also want to add some environment variables and options to let people configure where the coordination directory ought to be.
So, in summary, we need to go to https://github.com/DataBiosphere/toil/blob/a192aade86d66e196afe2ac0e96a34ff331c58eb/src/toil/common.py#L1273-L1298
And change it to use:
- --coordinationDir
- TOIL_COORDINATION_DIR
- /var/run/toil/
- XDG_RUNTIME_DIR
- tempfile.gettempdir()
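As a rough sketch of that precedence order (the option handling and helper name below are hypothetical, not the final implementation):

import os
import tempfile

def choose_coordination_dir(cli_coordination_dir=None):
    # Hypothetical precedence: CLI option, then environment variable, then
    # /var/run/toil for clusters we control, then the XDG session directory,
    # then the system temp directory as a last resort.
    candidates = [
        cli_coordination_dir,                     # --coordinationDir
        os.environ.get("TOIL_COORDINATION_DIR"),  # TOIL_COORDINATION_DIR
        "/var/run/toil",                          # cluster setups where we have root
        os.environ.get("XDG_RUNTIME_DIR"),        # real login/desktop sessions
    ]
    for candidate in candidates:
        # The real code would also need to try creating missing directories.
        if candidate and os.path.isdir(candidate) and os.access(candidate, os.W_OK):
            return candidate
    return tempfile.gettempdir()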
I quickly downgraded to version 5.6.0 to avoid the issue.
Sorry, I did not keep the error, but there was an OSError exception related to "file not found" and/or file permission issues. In the end the workflow obviously does not succeed. /var/run/user/$UID was not present.
I am using SLURM, and neither the front-end node nor the compute nodes have /var/run/user/<uid>. Changing the PAM config is possible; however, I consider it much better to leave the PAM config as distributed with the Debian packages, for many system administration (and security) reasons.
Regarding the solution you are proposing: /var/run (or /run) has permissions drwxr-xr-x, so a non-root user cannot do anything inside it without a privileged systemd service (such as PAM) managing it.
Furthermore, to set up our own directory there, we would also need root privileges on the system. In addition, manual creation as shown below is not persistent; it may be "cleaned" by a "legit" systemd process such as PAM when the last interactive session is closed.
sudo mkdir /var/run/user/1001
sudo chown 1001:1001 /var/run/user/1001
The XDG specs are related to X11 or desktop applications. Is it appropriate for an application such as Toil, which is more likely executed on a server without a desktop interface?
I think many people are using Toil without root privileges. For lock files I would go to places that must exist on every Linux setup, like:
- /run/lock (if o+w)
- /var/lock (if o+w)
- ~/
- /tmp
Maybe /var/lock is a good alternative for lock files; a quick check of the o+w bit is sketched below. On Debian 10 it has o+w, and /var/lock is a symlink to /run/lock; see: https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch05s09.html
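Just to illustrate the o+w check on those candidates (a small sketch, not anything from Toil itself):

import os
import stat

def is_world_writable_dir(path):
    # True if the path exists, is a directory, and has the o+w bit set.
    try:
        mode = os.stat(path).st_mode
    except OSError:
        return False
    return stat.S_ISDIR(mode) and bool(mode & stat.S_IWOTH)

for candidate in ("/run/lock", "/var/lock", os.path.expanduser("~"), "/tmp"):
    print(candidate, is_world_writable_dir(candidate))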
Sorry, I did not keep the error, but there was an OSError exception related to "file not found" and/or file permission issues. In the end the workflow obviously does not succeed. /var/run/user/$UID was not present.
Toil is supposed to tolerate /var/run/user/$UID not being present, and not being allowed to create it. If it isn't present, it will try to create it, and fall back to another location if it can't. We bother trying to create it because on Kubernetes Toil usually has root in its container and the permissions required to make the directory.
I can't tell by looking at the get_toil_coordination_dir() code what would be going wrong that would cause setup to fail with an unhandled OSError. Maybe the directory existence check is failing somehow? The rest of it is wrapped in try/except.
We can probably rewrite the whole piece and wrap it more aggressively with fallback code.
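Something along these lines, where every candidate is probed inside try/except and the first one that actually works wins (a sketch only, not the real get_toil_coordination_dir()):

import os
import tempfile
import uuid

def first_usable_dir(candidates):
    # Probe each candidate by actually creating and removing a file in it,
    # so any OSError (missing directory, bad permissions, read-only
    # filesystem) just moves us on to the next candidate instead of
    # crashing the worker.
    for candidate in candidates:
        if not candidate:
            continue
        try:
            os.makedirs(candidate, exist_ok=True)
            probe = os.path.join(candidate, "probe-" + uuid.uuid4().hex)
            with open(probe, "w"):
                pass
            os.remove(probe)
            return candidate
        except OSError:
            continue
    return tempfile.gettempdir()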
The XDG specs are related to X11 or desktop applications. Is it appropriate for an application such as Toil, which is more likely executed on a server without a desktop interface?
Our shared servers at UCSC seem to set XDG_RUNTIME_DIR for SSH logins. It's possible not all SSH logins in all environments provide this, but it's worth using it if it is available. People also do a lot of Toil workflow testing on desktop machines, so behaving well under a desktop session is a thing we care about.
Trying /run/lock is a good idea; it doesn't look to be world-writable on any of our UCSC servers, but it does seem to be on newer Debian-based systems. Plus a lot of what Toil puts in the coordination directory is lock files (though I think we also store some non-lockfile data?).
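For context, the lock files in question are roughly of this kind: small files that have an advisory lock taken on them. A sketch (illustrative only, not Toil's actual locking code):

import fcntl
import os

def acquire_lock(coordination_dir, name):
    # Create (or open) a small lock file in the coordination directory and
    # take a non-blocking exclusive advisory lock on it.
    path = os.path.join(coordination_dir, name + ".lock")
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    return fd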
I don't think we can use ~/ though, because we need a place that is almost certainly not on NFS or another shared filesystem, and is local to the current machine. Shared home directories across machines are quite common, and we don't yet have a way to ask whether a particular location is actually on a shared or local filesystem.
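For what it's worth, one way to make that question answerable on Linux would be to look up the mount containing a path in /proc/self/mounts and check its filesystem type; a rough sketch (the set of "shared" filesystem types is just a guess):

import os

NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs", "smbfs", "glusterfs", "lustre", "ceph"}

def looks_shared(path):
    # Find the longest mount point that is a prefix of the path and check
    # whether its filesystem type is a known network filesystem.
    path = os.path.realpath(path)
    best_mount, best_type = "", ""
    with open("/proc/self/mounts") as mounts:
        for line in mounts:
            fields = line.split()
            if len(fields) < 3:
                continue
            mount_point, fs_type = fields[1], fields[2]
            if path == mount_point or path.startswith(mount_point.rstrip("/") + "/"):
                if len(mount_point) > len(best_mount):
                    best_mount, best_type = mount_point, fs_type
    return best_type in NETWORK_FS_TYPES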
Below is my error (Toil 5.7.1); sorry, it was not an OSError but a FileNotFoundError. I think that on my particular system (based on Debian 10), pam_systemd is activated during authentication, making /var/run/user/<uid> temporarily available, or something like that.
[2022-07-28T23:51:24+0200] [MainThread] [I] [toil.job] Processing job 'CWLJob' ngs_primary_qc.1.fastqc kind-CWLJob/instance-omsk7igm v4
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/toil/deferred.py", line 190, in open
yield defer
File "/usr/local/lib/python3.7/dist-packages/toil/worker.py", line 407, in workerScript
job._runner(jobGraph=None, jobStore=jobStore, fileStore=fileStore, defer=defer)
File "/usr/lib/python3.7/contextlib.py", line 119, in __exit__
next(self.gen)
File "/usr/local/lib/python3.7/dist-packages/toil/fileStores/nonCachingFileStore.py", line 90, in open
os.remove(self.jobStateFile)
FileNotFoundError: [Errno 2] No such file or directory: '/var/run/user/10000/toil/f3b7e4c50faa5c51b28113a0d866f374/tmpzvuk4rjw.jobState'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/toil/worker.py", line 407, in workerScript
job._runner(jobGraph=None, jobStore=jobStore, fileStore=fileStore, defer=defer)
File "/usr/lib/python3.7/contextlib.py", line 130, in __exit__
self.gen.throw(type, value, traceback)
File "/usr/local/lib/python3.7/dist-packages/toil/deferred.py", line 193, in open
self._runOrphanedDeferredFunctions()
File "/usr/local/lib/python3.7/dist-packages/toil/deferred.py", line 285, in _runOrphanedDeferredFunctions
for filename in os.listdir(self.stateDir):
FileNotFoundError: [Errno 2] No such file or directory: '/var/run/user/10000/toil/f3b7e4c50faa5c51b28113a0d866f374/deferred'
[2022-07-28T23:51:24+0200] [MainThread] [E] [toil.worker] Exiting the worker because of a failed job on host compute-os-c16-1
And by the way, on version 5.6.0 I do indeed get OSErrors related to NFS that are already documented in other issues. Despite those errors the workflow completes successfully. The reason for the .nfsXXXX files is documented here: http://nfs.sourceforge.net/#faq_d2. I do not know how this error is handled, but all files are eventually deleted. In my experience, .nfsXXXX files that are no longer in use are very ephemeral (< 5 seconds).
[2022-07-29T18:07:32+0200] [MainThread] [E] [toil.deferred] [Errno 16] Device or resource busy: b'/exports/analysis/ICM_2022_04/ngs-primary-qc_workdir/466d86e3f5565a41b4159afd6ca3bfd1/deferred/.nfs0000000016dd0019000012b3'
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/toil/deferred.py", line 214, in cleanupWorker
robust_rmtree(os.path.join(stateDirBase, cls.STATE_DIR_STEM))
File "/usr/local/lib/python3.7/dist-packages/toil/lib/io.py", line 49, in robust_rmtree
robust_rmtree(child_path)
File "/usr/local/lib/python3.7/dist-packages/toil/lib/io.py", line 62, in robust_rmtree
os.unlink(path)
OSError: [Errno 16] Device or resource busy: b'/exports/analysis/ICM_2022_04/ngs-primary-qc_workdir/466d86e3f5565a41b4159afd6ca3bfd1/deferred/.nfs0000000016dd0019000012b3'
[2022-07-29T18:07:32+0200] [MainThread] [W] [toil.lib.threading] Could not clean up arena /exports/analysis/ICM_2022_04/ngs-primary-qc_workdir/9c9f5671-8d94-47b7-9c9f-258bb4071ec0-cleanup-arena-members completely: Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/toil/lib/threading.py", line 515, in leave
os.rmdir(self.lockfileDir)
OSError: [Errno 39] Directory not empty: '/exports/analysis/ICM_2022_04/ngs-primary-qc_workdir/9c9f5671-8d94-47b7-9c9f-258bb4071ec0-cleanup-arena-members'
I have a PR to try XDG_RUNTIME_DIR if set, then try /run/lock, and then fall back to the Toil workDir if that doesn't work (which in turn would be in the temp directory by default). Does that seem like it would work for you @jfouret?