nanopolish
call-methylation: segmentation fault
Hi, I have a problem running nanopolish call-methylation on GM12878. Indexing completes normally, but call-methylation fails with this error:
nanopolish-0.14.0/nanopolish call-methylation -b GM12878/chr1/GM12878.sort.bam -g human_ref/GRCh38.p13.genome.fa -r GM12878/fast5/chr1_basecall/basecall.fastq -q dam -t 100 > GM12878/chr1/GM12878.methdam_2.tsv
Segmentation fault
I tried both v0.13.2 and v0.14.0, and both versions show the same issue.
I followed the suggestion you provided:
nanopolish fast5-check -r reads.fastq
and got this:
The readdb file contains 143291 fast5 files
[fast5] ERROR: failed to open
[fast5] OK: opened /GM12878/fast5/chr1_basecall/workspace/Bham/FAB39043-3709921973/LomanLabz_PC_20160923_FNFAB39043_MN17250_mux_scan_Human_1D_ligation_R9_4_23348_ch101_read18_strand.fast5
[read] OK: found 134405 raw samples for ses
terminate called after throwing an instance of 'std::out_of_range'
what(): basic_string::substr: __pos (which is 5) > this->size() (which is 3)
Aborted
But I cannot tell which fast5 file causes the error.
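One way to narrow this down: the `[fast5] ERROR: failed to open` line above ends without a path, which may (this is a guess, not confirmed in the thread) point to a readdb row whose fast5 path is empty or stale. The sketch below scans the readdb for such rows. It assumes the readdb written by `nanopolish index` is a tab-separated file with `<read_id>\t<fast5_path>` per line; `find_bad_readdb_entries` is a hypothetical helper name, not part of nanopolish.

```python
import os
from typing import Iterable, List, Tuple


def find_bad_readdb_entries(lines: Iterable[str]) -> List[Tuple[int, str, str]]:
    """Return (line_number, read_id, reason) for suspicious readdb rows.

    Assumes each row is "<read_id>\t<fast5_path>"; adjust if your index differs.
    """
    bad = []
    for lineno, raw in enumerate(lines, start=1):
        fields = raw.rstrip("\n").split("\t")
        read_id = fields[0]
        # A row with no second column is treated the same as an empty path.
        path = fields[1].strip() if len(fields) > 1 else ""
        if not path:
            bad.append((lineno, read_id, "empty fast5 path"))
        elif not os.path.isfile(path):
            bad.append((lineno, read_id, "file not found: " + path))
    return bad


# Hypothetical usage (the file name is an assumption based on the fastq name):
# with open("basecall.fastq.index.readdb") as handle:
#     for lineno, read_id, reason in find_bad_readdb_entries(handle):
#         print(lineno, read_id, reason, sep="\t")
```

This only checks that each path is present and exists on disk; it does not open the fast5 files, so a corrupt file would still pass.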
I found a difference between the GM12878 fast5 files and the K562 fast5 files; the K562 files are processed by nanopolish without error.
K562:
/ Group
/Analyses Group
/Analyses/Basecall_1D_000 Group
/Analyses/Basecall_1D_000/BaseCalled_template Group
/Analyses/Basecall_1D_000/BaseCalled_template/Fastq Dataset {SCALAR}
/Analyses/Basecall_1D_000/Summary Group
/Analyses/Basecall_1D_000/Summary/basecall_1d_template Group
/Analyses/Segmentation_000 Group
/Analyses/Segmentation_000/Summary Group
/Analyses/Segmentation_000/Summary/segmentation Group
/Raw Group
/Raw/Reads Group
/Raw/Reads/Read_27792 Group
/Raw/Reads/Read_27792/Signal Dataset {59935/Inf}
/UniqueGlobalKey Group
/UniqueGlobalKey/channel_id Group
/UniqueGlobalKey/context_tags Group
/UniqueGlobalKey/tracking_id Group
GM12878:
/ Group
/Raw Group
/Raw/Reads Group
/Raw/Reads/Read_146 Group
/Raw/Reads/Read_146/Signal Dataset {6969/Inf}
/UniqueGlobalKey Group
/UniqueGlobalKey/channel_id Group
/UniqueGlobalKey/context_tags Group
/UniqueGlobalKey/tracking_id Group
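To make the comparison between the two layouts explicit, here is a minimal sketch that diffs two listings like the ones above. It works on the listing text only (no HDF5 library needed); `parse_listing` and `missing_groups` are hypothetical helper names, and the sample strings are abbreviated from the listings in this post.

```python
import re
from typing import List, Set


def parse_listing(text: str) -> Set[str]:
    """Collect HDF5 object paths from listing text, one object per line."""
    paths = set()
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("/"):
            continue
        path = line.split()[0]
        # Read numbers differ between files, so mask them before comparing.
        paths.add(re.sub(r"Read_\d+", "Read_N", path))
    return paths


def missing_groups(reference: str, candidate: str) -> List[str]:
    """Paths present in the reference listing but absent from the candidate."""
    return sorted(parse_listing(reference) - parse_listing(candidate))


# Abbreviated listings taken from this post.
k562 = """\
/ Group
/Analyses Group
/Analyses/Basecall_1D_000 Group
/Raw/Reads/Read_27792 Group
/Raw/Reads/Read_27792/Signal Dataset {59935/Inf}
"""
gm12878 = """\
/ Group
/Raw/Reads/Read_146 Group
/Raw/Reads/Read_146/Signal Dataset {6969/Inf}
"""

for path in missing_groups(k562, gm12878):
    print(path)
```

On the full listings above, the only entries missing from the GM12878 file are those under /Analyses. Whether that difference is related to the crash is not established here.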
Any suggestions would be appreciated.
It appears the GM12878 files are one-read-per-fast5 (the old format) rather than multi-fast5. As a test, could you try re-packing the fast5 files with single_to_multi_fast5 (from ont_fast5_api) to see if that resolves the problem?
Alternatively, since Nanopolish 0.14.0 supports slow5, you could try converting those fast5 files to slow5 first:
slow5tools f2s <single_fast5_dir> -d blow5_dir -a -p <num_processes>
slow5tools f2s may also be able to catch the problematic fast5 file.
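If the directory-level conversion fails without naming the file, one option is to run the converter on each fast5 separately and record which inputs exit with an error. The sketch below is a generic per-file runner, not part of slow5tools; `find_failing_inputs` is a hypothetical helper, and the single-file `f2s` invocation shown in the usage comment is an assumption that should be checked against `slow5tools f2s --help`.

```python
import subprocess
from typing import Callable, Iterable, List, Sequence, Tuple


def find_failing_inputs(
    paths: Iterable[str],
    build_command: Callable[[str], Sequence[str]],
) -> List[Tuple[str, int, str]]:
    """Run one command per input; return (path, exit_code, last_stderr_line) for failures."""
    failures = []
    for path in paths:
        result = subprocess.run(build_command(path), capture_output=True, text=True)
        # A non-zero code covers normal errors; a negative code means killed by a signal
        # (for example a segmentation fault).
        if result.returncode != 0:
            tail = result.stderr.strip().splitlines()[-1:]
            failures.append((path, result.returncode, tail[0] if tail else ""))
    return failures


# Hypothetical usage; the f2s arguments below are an assumption:
# failures = find_failing_inputs(
#     fast5_paths,
#     lambda p: ["slow5tools", "f2s", p, "-o", p + ".blow5"],
# )
# for path, code, message in failures:
#     print(path, code, message, sep="\t")
```

Running one process per file is slow for 143291 files, so it may be worth testing a subset first or splitting the directory in halves to bisect.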
Hi, sorry for the late reply. I tried the single_to_multi method and used the new multi-fast5 files for basecalling and mapping. I then ran into a problem again when running nanopolish:
HDF5-DIAG: Error detected in HDF5 (1.8.14) thread 140546462873344:
#000: H5A.c line 642 in H5Aread(): unable to read attribute
major: Attribute
minor: Read failed
#001: H5Aint.c line 661 in H5A_read(): datatype conversion failed
major: Attribute
minor: Unable to encode value
#002: H5T.c line 4816 in H5T_convert(): data type conversion failed
major: Attribute
minor: Unable to encode value
#003: H5Tconv.c line 3274 in H5T__conv_vlen(): can't read VL data
major: Datatype
minor: Read failed
#004: H5Tvlen.c line 891 in H5T_vlen_disk_read(): Unable to read VL information
major: Datatype
minor: Read failed
#005: H5HG.c line 622 in H5HG_read(): unable to protect global heap
major: Heap
minor: Unable to protect metadata
#006: H5HG.c line 262 in H5HG_protect(): unable to protect global heap
major: Heap
minor: Unable to protect metadata
#007: H5AC.c line 1320 in H5AC_protect(): H5C_protect() failed.
major: Object cache
minor: Unable to protect metadata
#008: H5C.c line 3574 in H5C_protect(): can't load entry
major: Object cache
minor: Unable to load metadata into cache
#009: H5C.c line 7954 in H5C_load_entry(): unable to load entry
major: Object cache
minor: Unable to load metadata into cache
#010: H5HGcache.c line 141 in H5HG_load(): bad global heap collection signature
major: Heap
minor: Unable to load metadata into cache
error reading attribute channel_number
nanopolish: src/nanopolish_squiggle_read.cpp:305: void SquiggleRead::load_from_raw(const Fast5Data&, uint32_t): Assertion `this->base_model[strand_idx] != NULL' failed.
I will try the slow5 method soon to see if it resolves the problem.