java.io.IOException: User limit of inotify instances reached or too many open files
I have a simple bpipe pipeline, as shown here:
run {
    "%_R*.fastq.gz" * [ fastqc + trim_fastq + alignment + [ flagstat_lane , BaseRecalibrator + PrintReads ] + move_lane_bam ] +
    "%.move" * [ merge_lane_bam + [ RealignerTargetCreator + IndelRealigner , bigwig, samstat, qualimap, qualimap_dedup ] + HaplotypeCaller + make_join_calling_file ] +
    [ GenotypeGVCFs + VariantRecalibrator_SNP + ApplyRecalibration_SNP + VariantRecalibrator_indel + ApplyRecalibration_indel ]
}
with 14 input samples. When I run this pipeline, I always get the following error:
java.io.IOException: User limit of inotify instances reached or too many open files
at bpipe.OutputDirectoryWatcher.
Any help/suggestions/advice would be appreciated. Thank you.
This is an unfortunate side effect of recent changes that use native file system events (inotify) to detect when files are created or modified.
You can turn off use of inotify with:
usePollerFileWatcher=true
in bpipe.config (or ~/.bpipeconfig), and that should make this error go away. However, it also means your Bpipe process will use more resources and be less efficient, so it may be worth exploring whether you can have the system limit for inotify instances raised (see here for example).
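For reference, on Linux the relevant limit can be inspected and raised via sysctl. A sketch, assuming root access on the machine running Bpipe; the key fs.inotify.max_user_instances is standard, but the value 1024 is just an example:

```shell
# Show the current per-user limit on inotify instances (128 is a common default)
cat /proc/sys/fs/inotify/max_user_instances

# Raise the limit on the running system (requires root):
#   sysctl -w fs.inotify.max_user_instances=1024
# To persist the change across reboots, add it to a sysctl config file, e.g.:
#   echo 'fs.inotify.max_user_instances = 1024' | sudo tee /etc/sysctl.d/90-inotify.conf
```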
Also, one aspect of this is that Bpipe registers one inotify instance per output directory. So if your pipeline creates a huge number of directories, that alone can trigger the problem. If those directories aren't essential to your pipeline, you may be able to avoid the issue by creating fewer of them.
it may be worth exploring if you can have the system limit for inotify instances raised
This is complicated by the fact that inotify doesn't work on shared filesystems (e.g. Lustre or NFS). We have been recommending usePollerFileWatcher=true.
Thank you for all useful advice!
Hi @ssadedin and @luuloi, some further detail on the environment underlying bpipe here. Bpipe is running in a virtual machine which is an NFS client, and is polling for files in this client's NFS mount. The NFS export on the server, which serves out the files that bpipe is polling for, is a Lustre mount that is being re-shared over NFS. Neither NFS nor Lustre supports the inotify API.
Consequently, it's highly unlikely that the inotify API will work properly in this use case. Polling is the only way to discover new files.
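To illustrate what polling means here, the poller strategy boils down to periodically listing a directory and diffing the result against the previous listing, rather than waiting for kernel events. A minimal sketch of that idea (the directory and file names are arbitrary examples, not anything Bpipe uses):

```shell
# Polling-style change detection: snapshot a directory listing,
# then diff a later snapshot to find newly created files.
dir=$(mktemp -d)
before=$(ls -1 "$dir")          # snapshot taken on one polling pass

touch "$dir/newfile.txt"        # a file appears between passes

after=$(ls -1 "$dir")           # snapshot taken on the next polling pass
# comm -13 prints lines present only in the second listing, i.e. new entries
comm -13 <(printf '%s\n' "$before" | sort) <(printf '%s\n' "$after" | sort)
```

This works on any filesystem, including NFS and Lustre, at the cost of repeated directory scans.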
It is possible that the inotify APIs succeed in the sense of returning 0 and not setting errno. I haven't tested this case. In other words, bpipe may have to deal with the possibility that the inotify API appears to work, but there are never any events.
The immediate problem in this case was that on the machine running bpipe, /proc/sys/fs/inotify/max_user_instances had value 128, which is not sufficient in this use case. That limit has now been increased to 1024. But as I was saying above, I believe that permitting bpipe to create additional inotify instances was pointless, because none of its inotify instances is ever going to work properly in this use case.
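Before (or instead of) raising the limit, it can be useful to see how many inotify instances are actually in use. A Linux-specific sketch: each inotify instance appears in /proc/<pid>/fd as a symlink to anon_inode:inotify, so counting those links gives the number in use (you may need elevated privileges to see other users' processes):

```shell
# Count inotify instances held by all processes visible to the current user
find /proc/[0-9]*/fd -lname 'anon_inode:inotify' 2>/dev/null | wc -l
```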
To make inotify work properly in this use case, there would have to be some sort of integration between Lustre change logs, NFS leases, and the NFS client's inotify watches, across a minimum of three different machines (Lustre MDT, Lustre client+NFS server, and NFS client). None of this exists right now.
If bpipe is receiving notifications in this use case, I suppose it must have fallen back to polling, despite being configured to use the 'native' watcher rather than the 'poller' watcher. Is this possible?
@MattSANU - I'm not entirely sure, but it is possible that Java itself falls back to a manual approach. Bpipe tries first to create a high-sensitivity notification and then falls back to low sensitivity, which I think could possibly be emulated by Java. You can see the code / logic here; the associated log messages will tell you what's going on if you look in your log files:
https://github.com/ssadedin/bpipe/blob/master/src/main/groovy/bpipe/OutputDirectoryWatcher.groovy#L194