
ValueError: cannot convert float NaN to integer

Open gawbul opened this issue 9 years ago • 7 comments

I'm trying to run hagfish_extract and am getting the following error:

[smoss@biolserva pacbio_assembly]$ hagfish_extract pbreads_to_pbasm_blasr.sorted.bam
/home/smoss/.local/lib/python2.7/site-packages/numpy/core/_methods.py:59: RuntimeWarning: Mean of empty slice.
  warnings.warn("Mean of empty slice.", RuntimeWarning)
/home/smoss/.local/lib/python2.7/site-packages/numpy/core/_methods.py:71: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
Traceback (most recent call last):
  File "/home/smoss/tools/hagfish/hagfish_extract", line 643, in <module>
    stats = doStats(bamBase, seqInfo, readPairs)
  File "/home/smoss/tools/hagfish/hagfish_extract", line 293, in doStats
    label='Peak top (%d)' % int(topInsert))
ValueError: cannot convert float NaN to integer

gawbul, Jun 23 '15 10:06

I notice that topInsert is assigned as smids[top] in the code (where top holds the indices of the maximum values in the histogram). Tracing the issue back, I printed smids and got a list of nan. Printing mids also gives a list of nan. Printing insertSizes, hist, and edges gives an empty list, a list of zeros, and a list of nan, respectively.
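The failure chain described above can be shown in isolation. This is a minimal sketch, not hagfish code: `peak_label` is a hypothetical helper, and a plain `float('nan')` stands in for the value NumPy produces when statistics are computed over an empty array. It shows that `int()` on NaN raises exactly the reported error, and one way a label could be built defensively.

```python
import math

def peak_label(top_insert):
    """Build a 'Peak top' label, tolerating a missing or NaN peak position.

    Hypothetical helper for illustration; it is not part of hagfish.
    """
    if top_insert is None or math.isnan(top_insert):
        return 'Peak top (n/a)'
    return 'Peak top (%d)' % int(top_insert)

# Stand-in for the NaN that results from statistics over zero read pairs.
top_insert = float('nan')

# The unguarded formatting from the traceback fails at int(), before the
# string is ever built.
error_message = None
try:
    label = 'Peak top (%d)' % int(top_insert)
except ValueError as err:
    error_message = str(err)

# The guarded version returns a placeholder label instead of raising.
safe_label = peak_label(top_insert)
```

A guard like this only avoids the crash. With zero read pairs there is still nothing meaningful to plot, so exiting early with a clear message may be the better behaviour.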

gawbul, Jun 23 '15 10:06

I printed bamBase, seqInfo and readPairs and get the following:

pb_to_pb_blasr
{'scf7180000000002|quiver': {'length': 5350059}}
{'scf7180000000002|quiver': {'start2': array([], dtype=float64), 'start1': array([], dtype=float64), 'stop1': array([], dtype=float64), 'stop2': array([], dtype=float64)}} 

gawbul, Jun 24 '15 11:06

Debug output here:

[smoss@biolserva pb_pbalign]$ hagfish_extract -vvv ../pb_to_pb_pbalign.bam 
HAGFISH INFO   processing bamfile pb_to_pb_pbalign
HAGFISH DEBUG  get sequence info from ../pb_to_pb_pbalign.bam
HAGFISH INFO   Reading cached seqInfo for pb_to_pb_pbalign
HAGFISH INFO   discovered 1 sequences
HAGFISH INFO   processing BAM file: ../pb_to_pb_pbalign.bam
HAGFISH INFO   Basename pb_to_pb_pbalign
HAGFISH INFO   Processing 1 sequences < 1000 nt (from a total of 1)
HAGFISH DEBUG  executing samtools
HAGFISH DEBUG     samtools view -f 67 ../pb_to_pb_pbalign.bam
HAGFISH INFO   discovered 0 readpairs (insert < 20000 nt) out of a total of 0
HAGFISH INFO   wroted data for 1 sequences with zero pairs
HAGFISH INFO   total no readpairs: 0
/home/smoss/.local/lib/python2.7/site-packages/numpy/core/_methods.py:59: RuntimeWarning: Mean of empty slice.
  warnings.warn("Mean of empty slice.", RuntimeWarning)
/home/smoss/.local/lib/python2.7/site-packages/numpy/core/_methods.py:71: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
HAGFISH DEBUG  stats {'average': nan, 'nopairs': 0, 'median': nan}
HAGFISH DEBUG  creating a histogram (0, nan)
HAGFISH INFO   insert size tops at nan
HAGFISH INFO   Estimating min ok insert size as nan
HAGFISH INFO   Estimating max ok insert size as nan
HAGFISH INFO   plotting normal figure
Traceback (most recent call last):
  File "/home/smoss/tools/hagfish/hagfish_extract", line 643, in <module>
    stats = doStats(bamBase, seqInfo, readPairs)
  File "/home/smoss/tools/hagfish/hagfish_extract", line 293, in doStats
    label='Peak top (%d)' % int(topInsert))
ValueError: cannot convert float NaN to integer

gawbul, Jun 24 '15 12:06

It seems to work fine with Illumina short-read data that I have mapped to the PacBio (PB) assembly using bwa, but it fails for PB-to-PB mapping using pbalign/blasr. Could this be down to the samtools step?

gawbul, Jun 24 '15 13:06

I changed the samFlag option to --samFlag=0 and now I am getting output, though I'm not sure how this affects things downstream.
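For context on the flag change: the debug output shows `samtools view -f 67`, and `-f` keeps only reads that have every listed flag bit set. The sketch below decodes a SAM flag value using the bit meanings from the SAM specification. `decode_flag` and `passes_required` are hypothetical helper names, not hagfish or samtools functions.

```python
# Bit values and meanings follow the FLAG field of the SAM specification.
SAM_FLAG_BITS = [
    (0x1, 'read paired'),
    (0x2, 'read mapped in proper pair'),
    (0x4, 'read unmapped'),
    (0x8, 'mate unmapped'),
    (0x10, 'read reverse strand'),
    (0x20, 'mate reverse strand'),
    (0x40, 'first in pair'),
    (0x80, 'second in pair'),
    (0x100, 'not primary alignment'),
    (0x200, 'read fails quality checks'),
    (0x400, 'PCR or optical duplicate'),
    (0x800, 'supplementary alignment'),
]

def decode_flag(flag):
    """Return the names of all bits set in a SAM flag value."""
    return [name for bit, name in SAM_FLAG_BITS if flag & bit]

def passes_required(read_flag, required):
    """Mimic the `samtools view -f` test: every required bit must be set."""
    return (read_flag & required) == required

required_bits = decode_flag(67)

# An unpaired forward-strand read typically has flag 0, reverse-strand 16.
unpaired_kept_with_67 = passes_required(16, 67)
unpaired_kept_with_0 = passes_required(16, 0)
```

Decoded, 67 requires a paired read, mapped in a proper pair, that is first in its pair. Single-ended long reads would normally carry none of those bits, which is consistent with the "0 readpairs ... out of a total of 0" line in the debug log. A value of 0 requires nothing, so every alignment passes the filter. Whether the downstream insert-size statistics are meaningful for unpaired reads is a separate question that this sketch does not answer.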

gawbul, Jun 24 '15 13:06

Dear @gawbul - sorry - I was (so it appears) not paying any attention to this page - is this still relevant?

mfiers, Dec 22 '15 13:12

@mfiers I'm not working on that project anymore, but it was still an issue, if I remember correctly. I'm not sure whether it was down to issues with the data, and I haven't had time to investigate since then.

gawbul, Dec 22 '15 14:12