nanopolish index/call_methylation

Open tsassa opened this issue 4 years ago • 7 comments

Hi, I have a couple of problems with nanopolish index and call-methylation.

I have sequencing data from a PromethION: a 71 GB fastq file basecalled by Guppy and 479 GB (761 files) of fast5_pass.

First, I got an error from nanopolish index, shown below.

(base) 23:29:14 ~/Desktop/promethion0602/3w $ nanopolish index -d /Volumes/Seagate\ Expansion\ Drive/fast5_pass /Users/tatsurosassa/Desktop/promethion0602/3w/0602_3w.fastq

[readdb] indexing /Volumes/Seagate Expansion Drive/fast5_pass

error getting group name size

Are my fast5 files too large to index? When I picked out about 70 GB of data from fast5_pass, the index was built and I could get methylation calls.

Second, I could also build an index from 140 GB of data picked out of fast5_pass, but nanopolish call-methylation failed as shown below. ("0602_3w_sort.bam" was mapped with minimap2; "chr3.fa" was downloaded from UCSC.)

(base) 13:34:34 ~/Desktop/promethion0602/3w $ nanopolish call-methylation -t 8 -r /Users/tatsurosassa/Desktop/promethion0602/3w/nanopolish_index200/0602_3w.fastq -b /Users/tatsurosassa/Desktop/promethion0602/3w/0602_3w_sort.bam -g /Users/tatsurosassa/Desktop/hg19/chr3.fa -w "chr3"> 0602_3w_chr3_index200_methylation_calls.tsv

[bam process] iterating over region: chr3

HDF5-DIAG: Error detected in HDF5 (1.10.4) thread 123145356435456:
  #000: H5F.c line 509 in H5Fopen(): unable to open file
    major: File accessibilty
    minor: Unable to open file
  #001: H5Fint.c line 1400 in H5F__open(): unable to open file
    major: File accessibilty
    minor: Unable to open file
〜 〜
HDF5-DIAG: Error detected in HDF5 (1.10.4) thread 123145304621056:
  #000: H5L.c line 805 in H5Lexists(): not a location
    major: Invalid arguments to routine
    minor: Inappropriate type
  #001: H5Gloc.c line 246 in H5G_loc(): invalid object ID
    major: Invalid arguments to routine
    minor: Bad value

[warning] fast5 file is unreadable and will be skipped: /Volumes/Seagate Expansion Drive/fast5_pickup2/PAE56649_pass_3201de12_140.fast5

[post-run summary] total reads: 209094, unparseable: 0, qc fail: 0, could not calibrate: 0, no alignment: 0, bad fast5: 209267

Is it because the fast5 data is low quality? Best regards.

tsassa avatar Jul 01 '20 05:07 tsassa

Hi,

These errors indicate there are problems with the format of your fast5 files. Is this a relatively recent sequencing run?

Jared

jts avatar Jul 08 '20 13:07 jts

Thank you for your reply.

This data was sequenced on June 2, 2020. Guppy version was 3.2.10.

Best regards.

tsassa avatar Jul 09 '20 05:07 tsassa

I don't think we've had any issues with nanopolish and recent versions of Guppy. The HDF5 format, which fast5 is a subset of, hasn't changed, and your failures are in opening the fast5 files as HDF5 files.

What it looks like is an issue with either permissions or the files having been moved.
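One way to test the "moved or unreadable" theory is to check every fast5 path recorded by the index. The sketch below assumes the index step wrote a tab-separated readdb file next to the fastq (for example `0602_3w.fastq.index.readdb`) with the read ID in the first column and the fast5 path in the second; please confirm that layout against your own file first. The function name `find_unreadable` is made up for this example.

```python
import os


def find_unreadable(readdb_path):
    """Return the set of fast5 paths in a readdb file that are missing or unreadable.

    Assumes each line is "<read_id>\t<fast5_path>"; lines without a second
    column are skipped.
    """
    bad = set()
    checked = set()
    with open(readdb_path) as fh:
        for line in fh:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 2:
                continue
            path = fields[1]
            if path in checked:
                # Many reads share one multi-read fast5 file; check each file once.
                continue
            checked.add(path)
            if not (os.path.isfile(path) and os.access(path, os.R_OK)):
                bad.add(path)
    return bad


# Example use (the path is illustrative):
# for path in sorted(find_unreadable("0602_3w.fastq.index.readdb")):
#     print(path)
```

If this reports every file as missing, the external drive is probably mounted at a different path than when the index was built, and re-running nanopolish index against the current location would be the next thing to try.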

oneillkza avatar Aug 14 '20 22:08 oneillkza

I came across the same problem. @tsassa, did you find a solution for this? My data is not new but from a couple of years ago (579 fast5_pass files), and 5-6 hours into the indexing I get the error message: error getting group name size

pesteller avatar May 12 '21 11:05 pesteller

Hey @pesteller, I'm having the same issue: when re-analysing some data that is a couple of years old, I get this error. Did you find a solution for it? Thanks!

NuriaDiaz avatar Jan 27 '22 13:01 NuriaDiaz

How old is the data? Could you email me a link where I can download an example fast5 file?

jts avatar Jan 27 '22 14:01 jts

Hi, thanks for your reply. The data is almost 2 years old. You can download an example fast5 here: https://usegalaxy.eu/u/diaz/h/fast5example

I hope you can access the link; data sharing with Galaxy is sometimes tricky, so let me know if you have issues. I think one or more of the files are somehow corrupt, but unfortunately I don't know how to find them easily. I tried one of the suggestions you've made to other people, using GDB, but I couldn't find anything useful. Thank you so much :)
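For anyone else hunting for a corrupt file, a cheap first pass is to check that each fast5 file starts with the 8-byte HDF5 signature. This is only a sketch: it assumes the signature sits at byte offset 0 (the HDF5 format also allows it at later offsets after a user block), and it only catches empty, truncated-at-start, or non-HDF5 files, not damage deeper inside a file. The function name `find_suspect_fast5` is made up for this example.

```python
import os

# The fixed 8-byte signature at the start of an HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def find_suspect_fast5(directory):
    """Return paths of .fast5 files that cannot be read or lack the HDF5 signature."""
    suspects = []
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            if not name.endswith(".fast5"):
                continue
            path = os.path.join(root, name)
            try:
                with open(path, "rb") as fh:
                    header = fh.read(len(HDF5_SIGNATURE))
            except OSError:
                # Permission problems or I/O errors also make a file suspect.
                suspects.append(path)
                continue
            if header != HDF5_SIGNATURE:
                suspects.append(path)
    return suspects


# Example use (the directory is illustrative):
# for path in find_suspect_fast5("fast5_pass"):
#     print(path)
```

Files that pass this check can still be damaged internally, so opening each remaining file with an HDF5 library such as h5py would be a slower but more thorough follow-up.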

NuriaDiaz avatar Feb 01 '22 13:02 NuriaDiaz