
Crashing due to errors locking files

Open kautto opened this issue 3 years ago • 7 comments

Running into some issues with the basecaller command. This occurs on multiple datasets, so I don't think it's anything run-specific (unless they're all corrupted in a similar way):


> loading model
> calling: 406 reads [04:07,  2.19 reads/s]Exception in thread Thread-1:
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/path/to/miniconda3/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/path/to/apps/bonito/bonito/fast5.py", line 139, in get_raw_data_for_read
    with get_fast5_file(filename, 'r') as f5_fh:
  File "/path/to/miniconda3/lib/python3.8/site-packages/ont_fast5_api/fast5_interface.py", line 13, in get_fast5_file
    return MultiFast5File(filepath, mode)
  File "/path/to/miniconda3/lib/python3.8/site-packages/ont_fast5_api/multi_fast5.py", line 13, in __init__
    self.handle = h5py.File(self.filename, self.mode)
  File "/path/to/miniconda3/lib/python3.8/site-packages/h5py/_hl/files.py", line 424, in __init__
    fid = make_fid(name, mode, userblock_size,
  File "/path/to/miniconda3/lib/python3.8/site-packages/h5py/_hl/files.py", line 190, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 96, in h5py.h5f.open
OSError: Unable to open file (unable to lock file, errno = 22, error message = 'Invalid argument')
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/path/to/miniconda3/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/path/to/apps/bonito/bonito/multiprocessing.py", line 67, in run
    for item in self.iterator:
  File "/path/to/apps/bonito/bonito/util.py", line 181, in batchify
    for k, v in items:
  File "/path/to/apps/bonito/bonito/crf/basecall.py", line 99, in <genexpr>
    ((read, chunk(signal, chunksize, overlap, pad_start=True)) for (read, signal) in reads)
  File "/path/to/apps/bonito/bonito/crf/basecall.py", line 94, in <genexpr>
    reads = (
  File "/path/to/apps/bonito/bonito/fast5.py", line 151, in get_reads
    for read in pool.imap(get_raw_data_for_read, job):
  File "/path/to/miniconda3/lib/python3.8/multiprocessing/pool.py", line 868, in next
    raise value
OSError: Unable to open file (unable to lock file, errno = 22, error message = 'Invalid argument')

(The /path/to/ prefix is a placeholder I substituted in the message; it's not the actual path.)

No other processes should be writing to the same location, so I'm not sure why it would be having a lock issue. Thoughts/help? Thanks!

kautto avatar Dec 07 '20 15:12 kautto

Is it possible that a non-HDF5 file could be named *.fast5 under your directory structure, or are you using any unusual characters in the filenames?

iiSeymour avatar Dec 07 '20 15:12 iiSeymour
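
One way to rule out a mislabeled file is to check each *.fast5 for the HDF5 format signature. The sketch below is a standard-library-only illustration, not part of bonito: the helper names `looks_like_hdf5` and `find_non_hdf5` are hypothetical, and the check only inspects the signature bytes, so it would not catch a truncated or otherwise damaged HDF5 file. If h5py is installed, `h5py.is_hdf5(path)` performs a comparable check.

```python
from pathlib import Path

# The 8-byte signature that begins every HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file carries the HDF5 signature.

    The signature sits at offset 0, or at 512, 1024, 2048, ... when the
    file has a user block, so each of those offsets is probed in turn.
    """
    size = Path(path).stat().st_size
    with open(path, "rb") as fh:
        offset = 0
        while offset + len(HDF5_SIGNATURE) <= size:
            fh.seek(offset)
            if fh.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return True
            offset = 512 if offset == 0 else offset * 2
    return False


def find_non_hdf5(directory):
    """List *.fast5 files under `directory` that lack the HDF5 signature."""
    return sorted(p for p in Path(directory).rglob("*.fast5") if not looks_like_hdf5(p))
```

Calling `find_non_hdf5(".")` from the directory passed to the basecaller would return an empty list if every *.fast5 file at least starts like an HDF5 file.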

Hi @iiSeymour,

I copied a small test set over and tried again, but it still crashed the same way. There aren't any odd characters in the filenames, e.g.


total 338M
-rw-rw-rw- 1 kaut03 4294967294 74M Oct  1  2019 FAL05049_2a5f483337b0b604383d7e357128d6991e885de7_36.fast5
-rw-rw-rw- 1 kaut03 4294967294 76M Oct  1  2019 FAL05049_2a5f483337b0b604383d7e357128d6991e885de7_360.fast5
-rw-rw-rw- 1 kaut03 4294967294 76M Oct  1  2019 FAL05049_2a5f483337b0b604383d7e357128d6991e885de7_361.fast5
-rw-rw-rw- 1 kaut03 4294967294 76M Oct  1  2019 FAL05049_2a5f483337b0b604383d7e357128d6991e885de7_362.fast5
-rw-rw-rw- 1 kaut03 4294967294 76M Oct  1  2019 FAL05049_2a5f483337b0b604383d7e357128d6991e885de7_363.fast5
-rw-rw-rw- 1 kaut03 4294967294 76M Oct  1  2019 FAL05049_2a5f483337b0b604383d7e357128d6991e885de7_364.fast5
-rw-rw-rw- 1 kaut03 4294967294 76M Oct  1  2019 FAL05049_2a5f483337b0b604383d7e357128d6991e885de7_365.fast5
-rw-rw-rw- 1 kaut03 4294967294 76M Oct  1  2019 FAL05049_2a5f483337b0b604383d7e357128d6991e885de7_366.fast5
-rw-rw-rw- 1 kaut03 4294967294 76M Oct  1  2019 FAL05049_2a5f483337b0b604383d7e357128d6991e885de7_367.fast5
-rw-rw-rw- 1 kaut03 4294967294 76M Oct  1  2019 FAL05049_2a5f483337b0b604383d7e357128d6991e885de7_368.fast5
-rw-rw-rw- 1 kaut03 4294967294 76M Oct  1  2019 FAL05049_2a5f483337b0b604383d7e357128d6991e885de7_369.fast5

And it's only those 11 .fast5 files in the folder.

kautto avatar Dec 07 '20 15:12 kautto

The filenames look okay - what filesystem are you using?

iiSeymour avatar Dec 07 '20 15:12 iiSeymour

Ubuntu 18.04.5

kautto avatar Dec 07 '20 15:12 kautto

Can you run stat --file-system --format=%T . in the directory containing the fast5 files?

iiSeymour avatar Dec 07 '20 15:12 iiSeymour
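
The same question can be answered from Python by looking up the mount that contains the directory. This is a rough, Linux-only sketch that assumes /proc/mounts is available; the function name `filesystem_type` is hypothetical, and mount points containing spaces (which /proc/mounts escapes) are not handled. The reported string may also differ slightly from what stat prints, for example `nfs4` rather than `nfs`.

```python
import os


def filesystem_type(path, mounts_file="/proc/mounts"):
    """Return the filesystem type of the mount that contains `path`.

    Each line of /proc/mounts reads: device mountpoint fstype options ...
    The longest mount point that is a prefix of `path` wins; on a tie the
    later entry wins, since it was mounted over the earlier one.
    """
    target = os.path.realpath(path)
    best_mount, best_type = "", None
    with open(mounts_file) as fh:
        for line in fh:
            fields = line.split()
            if len(fields) < 3:
                continue
            mount_point, fs_type = fields[1], fields[2]
            # Compare with trailing slashes so /data does not match /database.
            prefix = mount_point.rstrip("/") + "/"
            if (target + "/").startswith(prefix) and len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type
```

A result beginning with `nfs` for the fast5 directory would point toward a network filesystem, where file locking is a common source of trouble.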

It's nfs

kautto avatar Dec 07 '20 15:12 kautto

Can you try with HDF5 file locking disabled:

$ export HDF5_USE_FILE_LOCKING=FALSE
$ bonito basecaller ...

iiSeymour avatar Dec 07 '20 15:12 iiSeymour
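
When the basecaller is launched from a Python script rather than a shell, the same setting can be applied through the environment. The sketch below is only an illustration: `env_without_hdf5_locking` is a hypothetical helper, and it assumes an HDF5 build recent enough to honor `HDF5_USE_FILE_LOCKING`. Note that disabling locking removes a safety check, so it is best suited to cases like this one where no other process writes to the files.

```python
import os


def env_without_hdf5_locking(base=None):
    """Return a copy of the environment with HDF5 file locking disabled.

    `base` defaults to the current process environment and is never
    modified in place.
    """
    env = dict(os.environ if base is None else base)
    env["HDF5_USE_FILE_LOCKING"] = "FALSE"
    return env


# In-process use: set the variable before h5py (and so the HDF5 library)
# is imported, to be safe across HDF5 versions.
#   os.environ["HDF5_USE_FILE_LOCKING"] = "FALSE"
#   import h5py
#
# Child-process use, where `command` is a hypothetical argument list
# built by the caller:
#   subprocess.run(command, env=env_without_hdf5_locking(), check=True)
```

Passing a modified copy to the child process keeps the change scoped to that one run instead of altering the parent's environment.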