
Issue on reading fast5 files

Open uuwuyi opened this issue 2 years ago • 9 comments

Hi, I used pip install ont-bonito to install v0.5.0. I ran bonito with this command: bonito basecaller [email protected] ./barcode02 --device cuda > ./barcode02/basecalls_barcode02.fastq. However, it does not seem to read the fast5 files; the log looks like this:

> loading model [email protected]
> outputting unaligned fastq
^M> calling: 0 reads [00:00, ? reads/s]

Can you please help me figure out what the problem with the fast5 files is? Thank you!

uuwuyi · Dec 07 '21

Hi @ugobananas

It's not obvious what the issue is here - I do see a control character (^M) in the output. Does it hang if you try again?

iiSeymour · Dec 17 '21

Hi @iiSeymour

Thanks for your reply! I tried a few times and they all hung. I'm using CUDA 10.2 and bonito 0.5.0. The model downloads didn't return any errors. Below are some job outputs.

Job Output Follows ...
===============================================================================
> loading model [email protected]
> outputting unaligned fastq
^M> calling: 0 reads [00:00, ? reads/s]^M                                     ^M> completed reads: 0
> duration: 0:00:01
> samples per second 0.0E+00
> done
==============================================================================
Running epilogue script on pink52.

Submit time  : 2021-12-06T12:46:56
Start time   : 2021-12-06T16:46:04
End time     : 2021-12-06T16:47:10
Elapsed time : 00:01:06 (Timelimit=2-12:00:00)
Job Output Follows ...
===============================================================================
> loading model [email protected]
> outputting unaligned fastq
^M> calling: 0 reads [00:00, ? reads/s]slurmstepd-pink53: error: *** JOB 856779 ON pink53 CANCELLED AT 2021-12-07T18:04:21 DUE TO TIME LIMIT ***
==============================================================================
Running epilogue script on pink53.

Submit time  : 2021-12-07T16:04:03
Start time   : 2021-12-07T16:04:11
End time     : 2021-12-07T18:04:21
Elapsed time : 02:00:10 (Timelimit=02:00:00)

uuwuyi · Dec 20 '21

Hi @iiSeymour

Below is the detailed error. Is there anything I can do to solve it? I also tried installing v0.4.0 with pip install ont-bonito==0.4.0, but it failed to build the wheel. Any suggestions on this? Thanks for your help!

> loading model [email protected]
> model basecaller params: {'batchsize': 512, 'chunksize': 10000, 'overlap': 500, 'quantize': None}
> outputting unaligned fastq
^M> calling: 0 reads [00:00, ? reads/s]Exception in thread Thread-1:
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/scratch/yw14n20/soft/miniconda3/envs/nanopore/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/scratch/yw14n20/soft/miniconda3/envs/nanopore/lib/python3.8/site-packages/bonito/fast5.py", line 257, in get_raw_data_for_read
    return Read(f5_fh.get_read(read_id), filename)
  File "/scratch/yw14n20/soft/miniconda3/envs/nanopore/lib/python3.8/site-packages/bonito/fast5.py", line 70, in __init__
    exp_start_dt = datetime.strptime(self.exp_start_time, "%Y-%m-%dT%H:%M:%S")
  File "/scratch/yw14n20/soft/miniconda3/envs/nanopore/lib/python3.8/_strptime.py", line 568, in _strptime_datetime
    tt, fraction, gmtoff_fraction = _strptime(data_string, format)
  File "/scratch/yw14n20/soft/miniconda3/envs/nanopore/lib/python3.8/_strptime.py", line 352, in _strptime
    raise ValueError("unconverted data remains: %s" %
ValueError: unconverted data remains: .966747+00:00
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/scratch/yw14n20/soft/miniconda3/envs/nanopore/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/scratch/yw14n20/soft/miniconda3/envs/nanopore/lib/python3.8/site-packages/bonito/multiprocessing.py", line 110, in run
    for item in self.iterator:
  File "/scratch/yw14n20/soft/miniconda3/envs/nanopore/lib/python3.8/site-packages/bonito/crf/basecall.py", line 61, in <genexpr>
    chunks = thread_iter(
  File "/scratch/yw14n20/soft/miniconda3/envs/nanopore/lib/python3.8/site-packages/bonito/fast5.py", line 279, in get_reads
    for read in pool.imap(get_raw_data_for_read, job):
  File "/scratch/yw14n20/soft/miniconda3/envs/nanopore/lib/python3.8/multiprocessing/pool.py", line 868, in next
    raise value
ValueError: unconverted data remains: .966747+00:00
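
For anyone hitting the same error: the traceback shows bonito's fast5 reader parsing exp_start_time with a seconds-only format, while the files here evidently store an ISO 8601 timestamp with fractional seconds and a UTC offset. A minimal Python sketch of the mismatch (the timestamp value below is an assumed example; only the ".966747+00:00" remainder is known from the traceback):

from datetime import datetime

# assumed example value; only the ".966747+00:00" suffix is known from the traceback
exp_start_time = "2021-12-07T16:04:03.966747+00:00"

# bonito v0.5.0 parses seconds-level precision only, so the fractional seconds
# and the UTC offset are left over and strptime raises ValueError
try:
    datetime.strptime(exp_start_time, "%Y-%m-%dT%H:%M:%S")
except ValueError as err:
    print(err)  # unconverted data remains: .966747+00:00

# a format that consumes the extra fields parses the same string cleanly
print(datetime.strptime(exp_start_time, "%Y-%m-%dT%H:%M:%S.%f%z"))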

uuwuyi · Jan 18 '22

@ugobananas this issue is fixed on master https://github.com/nanoporetech/bonito/commit/c8417b7d0a7dbaa338983413ba5c6fbcd4163075 - are you able to build and run from source until v0.5.1 is released?

iiSeymour · Jan 18 '22

Hi @iiSeymour

I tried the version built from source. Here is the error it returned:

> loading model [email protected]
> model basecaller params: {'batchsize': 512, 'chunksize': 10000, 'overlap': 500, 'quantize': None}
> outputting unaligned fastq
^M> calling: 0 reads [00:00, ? reads/s]Exception in thread Thread-3:
Traceback (most recent call last):
  File "/scratch/yw14n20/soft/miniconda3/envs/nanopore/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/mainfs/scratch/yw14n20/soft/bonito/bonito/multiprocessing.py", line 110, in run
    for item in self.iterator:
  File "/mainfs/scratch/yw14n20/soft/bonito/bonito/crf/basecall.py", line 69, in <genexpr>
    (read, compute_scores(model, batch, reverse=reverse)) for read, batch in batches
  File "/mainfs/scratch/yw14n20/soft/bonito/bonito/crf/basecall.py", line 35, in compute_scores
    sequence, qstring, moves = beam_search(
  File "/mainfs/scratch/yw14n20/soft/bonito/venv3/lib/python3.8/site-packages/koi/decode.py", line 13, in beam_search
    raise TypeError('Expected fp16 but received %s' % scores.dtype)
TypeError: Expected fp16 but received torch.float32

uuwuyi · Jan 19 '22

Which GPU are you running on? The beam_search decoder requires half-precision but it seems your GPU doesn't support it.
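
Not a fix, but a quick way to check whether a given GPU is likely to handle the fp16 decode path. This is only a sketch under the assumption that fast half-precision needs compute capability 7.0 or newer (e.g. Volta); the exact requirement of koi's decoder isn't stated in this thread:

import torch

# report the visible CUDA device and its compute capability
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability()
    print(f"{torch.cuda.get_device_name()}: compute capability {major}.{minor}")
    # assumed cut-off: Volta (7.0) and newer, e.g. V100; a GTX 1080 reports 6.1
    print("fp16 decoding likely supported" if major >= 7 else "fp16 decoding likely unsupported")
else:
    print("no CUDA device visible")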

iiSeymour · Jan 19 '22

Hi, I ran it on GTX 1080 consumer GPUs. We also have Volta V100 enterprise compute GPUs available.

uuwuyi · Jan 19 '22

Thanks @ugobananas, the GTX 1080 doesn't have the support needed to run the decoder, but the V100 is ideally suited.

iiSeymour · Jan 19 '22

Thanks @iiSeymour, it works on V100!

uuwuyi · Jan 19 '22