Resquiggle process stuck at 0

Open orangehe opened this issue 4 years ago • 13 comments

I ran into a problem when using the tombo command line. tombo resquiggle started and appeared to run normally, but the progress bar stayed stuck at 0 and nothing was written to the single fast5 files.

I compared these .fast5 files with .fast5 files on which the resquiggle command runs successfully (another set of data sequenced about half a year ago). Apart from some version changes, the main differences are:
1. Attributes of /Raw/Reads/Read_/: a new attribute "end_reason" was added.
2. Attributes of */UniqueGlobalKey/trackingid/: a new attribute "configuration version" was added.
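A comparison like the one described above can be scripted. The following is only a minimal sketch: it assumes h5py is installed, that the files are single-read fast5 files, and that read groups are named Read_ followed by digits. The function names (collect_attrs, normalize_path, diff_attrs) are made up for this illustration and are not part of Tombo.

```python
import re


def normalize_path(path):
    """Replace per-read group names so two different reads can be compared."""
    return re.sub(r"Read_\d+", "Read_*", path)


def collect_attrs(fast5_path):
    """Return {normalized group path: {attribute name: value}} for one file.

    h5py is imported here so the helpers above work without it.
    """
    import h5py

    out = {}
    with h5py.File(fast5_path, "r") as f:
        def visit(name, obj):
            out[normalize_path("/" + name)] = {k: obj.attrs[k] for k in obj.attrs}
        f.visititems(visit)
    return out


def diff_attrs(old, new):
    """List attribute names present in `new` but missing from `old`, per path."""
    added = {}
    for path, attrs in new.items():
        extra = sorted(set(attrs) - set(old.get(path, {})))
        if extra:
            added[path] = extra
    return added
```

Calling diff_attrs(collect_attrs(old_file), collect_attrs(new_file)) on one older and one newer read should list attributes such as "end_reason" under the paths where they were added.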

I wonder whether the hang is caused by these changes and how the problem can be solved. Thanks!

orangehe avatar Jul 06 '20 02:07 orangehe

Could you post the full command and output text in order to help continue debugging this issue?

The stuck resquiggle process is unlikely to be due to these attribute changes. My best guesses at this point would be file access or searching issues.

marcus1487 avatar Jul 08 '20 18:07 marcus1487

The command and the output of the stuck resquiggle process are the following.

Command:
./tombo resquiggle ./fast5_pass_single/ ./Homo_sapiens/hg19_UCSC/20130422/Genome/chrALL.fa --processes 8 --threads-per-process 2 --overwrite --num-most-common-errors 5

Output text:
[09:31:32] Loading minimap2 reference.
[09:33:22] Getting file list.
[09:33:37] Loading default canonical ***** DNA ***** model.
[09:33:43] Re-squiggling reads (raw signal to genomic sequence alignment).
0%| | 0/44000 [00:00<?, ?it/s]
......
0%| | 0/44000 [24:11<?, ?it/s]

I sampled 11 multi-fast5 files from the original dataset and split them with:
./multi_to_single_fast5 --input_path ./fast5_test/ --save_path ./fast5_pass_single/ --recursive -t 8
The resquiggle run stayed at 0 for 24 minutes and no error was reported. I had previously run the command on the full dataset, where it stayed at 0 for over 24 hours.

Regarding your first guess, the mode strings of the single .fast5 files and their parent directories are all -rw-r--r--. I think this should be OK for tombo to read and write these files. What should I check to see whether the hang is caused by "searching issues"?

orangehe avatar Jul 09 '20 02:07 orangehe

The searching issues could be caused by, for example, a circular loop of linked directories that tombo may be searching recursively. But given that the progress bar shows the total number of iterations, it looks like this is not the issue.
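For anyone who wants to rule out the linked-directory case directly, here is a minimal sketch that reports symlinked directories pointing back to one of their own ancestors. It uses only the Python standard library. The function name find_directory_loops is hypothetical, and the thread does not say whether Tombo follows symlinks when it searches.

```python
import os


def find_directory_loops(root):
    """Return paths of directories under `root` that resolve to an ancestor."""
    loops = []

    def walk(path, ancestors):
        real = os.path.realpath(path)
        if real in ancestors:
            # Following this entry would revisit a directory we are inside of.
            loops.append(path)
            return
        try:
            entries = list(os.scandir(path))
        except OSError:
            return  # unreadable directory: skip it rather than fail
        for entry in entries:
            if entry.is_dir(follow_symlinks=True):
                walk(entry.path, ancestors | {real})

    walk(root, frozenset())
    return loops
```

An empty list for the fast5 directory means no circular links were found under it.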

I seem to remember some issues with numpy trying to run too many processes by default in some environments. Could you set export OMP_NUM_THREADS=1 before running the re-squiggle step and report the results?
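If the command is launched from a script rather than an interactive shell, the same setting can be passed through the child environment. This is a sketch only; single_threaded_env is a made-up helper name, and the command shown in the comment is the one posted earlier in this thread.

```python
import os
import subprocess


def single_threaded_env(base_env=None):
    """Copy the environment and force OMP_NUM_THREADS=1 in the copy."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = "1"
    return env


def run_with_single_thread(cmd):
    """Run `cmd` (a list of arguments) with OMP_NUM_THREADS=1 set."""
    return subprocess.run(cmd, env=single_threaded_env(), check=True)


# Example (not executed here), using the command from this thread:
# run_with_single_thread(["./tombo", "resquiggle", "./fast5_pass_single/",
#                         "./Homo_sapiens/hg19_UCSC/20130422/Genome/chrALL.fa",
#                         "--processes", "8", "--overwrite"])
```

The variable is set only for the child process, so the calling shell is left unchanged.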

Also does this happen even on a very small sample of reads? Could you test with say 100 reads to see if they complete processing in this environment?
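One way to build such a test set is to copy a random sample of reads into a fresh directory, so the originals are not modified by --overwrite. The sketch below assumes single-read files with a .fast5 extension and unique file names. The function name sample_fast5 is hypothetical.

```python
import random
import shutil
from pathlib import Path


def sample_fast5(src_dir, dst_dir, n=100, seed=0):
    """Copy up to `n` randomly chosen .fast5 files from src_dir into dst_dir."""
    files = sorted(Path(src_dir).rglob("*.fast5"))
    chosen = random.Random(seed).sample(files, min(n, len(files)))
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for f in chosen:
        target = dst / f.name  # assumes file names are unique across subfolders
        shutil.copy2(f, target)
        copied.append(target)
    return copied
```

The resquiggle command can then be pointed at the new directory instead of the full dataset.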

marcus1487 avatar Jul 10 '20 13:07 marcus1487

Hi @marcus1487. Recently, when I run tombo I get an error message like this:

$/anaconda3/bin/tombo resquiggle --overwrite ./0/ Homo_sapiens.GRCh38.genome.fa --ignore-read-locks --dna --processes 1 --q-score 6
[09:58:45] Loading minimap2 reference.
[10:00:14] Getting file list.
[10:00:16] Re-squiggling reads (raw signal to genomic sequence alignment).
0%| | 0/44 [00:00<?, ?it/s]Exception in thread Thread-1:
Traceback (most recent call last):
  File "/anaconda3/lib/python3.7/threading.py", line 917, in _bootstrap_inner
    self.run()
  File "/anaconda3/lib/python3.7/threading.py", line 865, in run
    self._target(*self._args, **self._kwargs)
  File "/anaconda3/lib/python3.7/site-packages/ont_tombo-1.5-py3.7-linux-x86_64.egg/tombo/resquiggle.py", line 1636, in _io_and_mappy_thread_worker
    obs_filter, index_q, q_score_thresh, sig_match_thresh, std_ref)
  File "/anaconda3/lib/python3.7/site-packages/ont_tombo-1.5-py3.7-linux-x86_64.egg/tombo/resquiggle.py", line 1386, in _io_and_map_read
    all_raw_signal = th.get_raw_read_slot(fast5_data)['Signal'][:]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/anaconda3/lib/python3.7/site-packages/h5py/_hl/dataset.py", line 573, in __getitem__
    self.id.read(mspace, fspace, arr, mtype, dxpl=self._dxpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5d.pyx", line 182, in h5py.h5d.DatasetID.read
  File "h5py/_proxy.pyx", line 130, in h5py._proxy.dset_rw
  File "h5py/_proxy.pyx", line 84, in h5py._proxy.H5PY_H5Dread
OSError: Can't read data (can't open directory: /usr/local/hdf5/lib/plugin)

hygine avatar Jul 13 '20 02:07 hygine

This looks to be due to the recent move to default vbz compression of raw signal. Please find details for addressing this issue here: https://github.com/nanoporetech/vbz_compression

marcus1487 avatar Jul 13 '20 03:07 marcus1487

@orangehe Did the above suggestions help resolve this issue for you?

marcus1487 avatar Jul 14 '20 16:07 marcus1487

@marcus1487 Hi, I set export OMP_NUM_THREADS=1 before running the resquiggle command, but it didn't help. I also sampled 100 single fast5 files into a new directory and ran the resquiggle command on it, but it was still stuck at 0.

We've tried all solutions we could come up with. We are willing to share the subset of data with you if you would like to check it directly.

orangehe avatar Jul 15 '20 01:07 orangehe

Hi all, I am glad to say I have found a way to solve this problem. Set HDF5_PLUGIN_PATH in your environment: export HDF5_PLUGIN_PATH=/usr/local/hdf5/lib/plugin (this is the default; please change it to your own path). This solution comes from https://github.com/nanoporetech/vbz_compression/issues/5.
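Before re-running resquiggle, it may help to confirm that the directory named by HDF5_PLUGIN_PATH exists and actually contains a VBZ plugin library. The sketch below is a guess at a useful check and not an official tool. It assumes the plugin file name contains "vbz", and find_vbz_plugins is a made-up name.

```python
import os
from pathlib import Path


def find_vbz_plugins(env=None):
    """Return files whose name contains 'vbz' in each HDF5_PLUGIN_PATH directory."""
    env = os.environ if env is None else env
    raw = env.get("HDF5_PLUGIN_PATH", "")
    found = []
    # The variable may hold several directories separated by the OS path separator.
    for directory in filter(None, raw.split(os.pathsep)):
        p = Path(directory)
        if p.is_dir():
            # Assumption: the VBZ plugin library has "vbz" in its file name.
            found.extend(sorted(f for f in p.iterdir() if "vbz" in f.name.lower()))
    return found
```

An empty result suggests that either the variable is unset, the directory does not exist, or the plugin is installed somewhere else.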

pynie1 avatar Jul 15 '20 11:07 pynie1

@orangehe , I noticed that you have the --ignore-read-locks flag set. I'm wondering if you might have zombie processes still accessing the FAST5 files and blocking the current processes from accessing the files. On some systems, when a Tombo process is killed it leaves workers accessing the FAST5 files. I generally check this with htop -u [username].

I would also suggest potentially re-packing the FAST5 files if you have attempted to re-process (run resquiggle) many times on these files. HDF5 format does not do a great job of re-using space from deleted partitions so files can get quite large. This is a known issue with Tombo and future solutions will aim to rectify this issue, though it is quite core to how Tombo works and thus likely won't be addressed in the current implementation of Tombo.
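Re-packing can be done with the h5repack tool that ships with HDF5, which rewrites a file into a new copy. The sketch below only builds the command lines. It assumes h5repack is on the PATH and that the output directories are created before the commands are run. The function name repack_commands is hypothetical, and for VBZ-compressed files h5repack may also need the plugin to be discoverable.

```python
from pathlib import Path


def repack_commands(src_dir, dst_dir):
    """Build one `h5repack <input> <output>` command per .fast5 file."""
    src, dst = Path(src_dir), Path(dst_dir)
    commands = []
    for f in sorted(src.rglob("*.fast5")):
        out = dst / f.relative_to(src)  # mirror the source directory layout
        commands.append(["h5repack", str(f), str(out)])
    return commands


# Example (not executed here):
# import subprocess
# for cmd in repack_commands("./fast5_pass_single", "./fast5_repacked"):
#     Path(cmd[2]).parent.mkdir(parents=True, exist_ok=True)
#     subprocess.run(cmd, check=True)
```

Writing to a separate directory keeps the original files untouched until the repacked copies have been checked.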

marcus1487 avatar Jul 15 '20 18:07 marcus1487

@marcus1487, I ran tombo resquiggle on another system. After setting HDF5_PLUGIN_PATH, an error occurred: OSError: Can't read data (can't dlopen:/lib64/libc.so.6: version `GLIBC_2.14' not found. I then set export LD_LIBRARY_PATH=./glibc/2.14/lib and re-ran tombo resquiggle. Now nothing is written to the log file; I don't even see "Loading minimap2 reference." It may be stuck, and I can't tell what it is doing. Do you have any explanation for this? I look forward to your reply.

pynie1 avatar Aug 07 '20 04:08 pynie1

@marcus1487 Hi, Marcus. I have exactly the same issue as orangehe described: the tombo resquiggle command is stuck at 0%. I have not used the --ignore-read-locks flag, and I did not reprocess the fast5 files other than converting the raw multi-fast5 files to single fast5.

Please find my command and the terminal output below.

Command:
tombo resquiggle ./test_data_single_fast5/ ./Genome/chrALL.fa --overwrite --num-most-common-errors 5

Terminal output:
[14:43:24] Loading minimap2 reference.
[14:45:03] Getting file list.
[14:45:04] Loading default canonical ***** DNA ***** model.
[14:45:05] Re-squiggling reads (raw signal to genomic sequence alignment).

0%| | 0/40000 [00:00<?, ?it/s]

0%| | 0/40000 [00:00<?, ?it/s]

I am using tombo version 1.5. I tested the same command on earlier fast5 data generated on the ONT machine before the basecalling software update in July, and tombo resquiggle ran without problems on those data. I suspect that the recent ONT basecaller software update changed the fast5 files in a way that is incompatible with tombo resquiggle.

If you would like to look into this problem, please see the Dropbox link below for a problematic raw fast5 file as an example.

https://www.dropbox.com/s/anyhnxoynvyp9ez/PAF07232_pass_a161ba96_1300.fast5?dl=0

Looking forward to your reply. Thanks for your time!

yemingx avatar Sep 17 '20 11:09 yemingx

This looks as though it may have to do with VBZ compression and the HDF5 plugin. Have you installed the plugin?

marcus1487 avatar Sep 22 '20 16:09 marcus1487

@yemingx @orangehe @pynie1 I'm having the same problem with resquiggle (i.e., stuck at 0%). The HDF5 plugin is installed. Any other suggestions to get this to work? Thanks.

keenhl avatar Jan 09 '24 21:01 keenhl