Update nanopolish v0.13.3 in anaconda

Open egirard1 opened this issue 2 years ago • 11 comments

Hi there, would it be possible to update the version of nanopolish available in conda, please? Best, Elodie

egirard1 avatar Oct 04 '21 08:10 egirard1

Hi,

I'm going to make the next release after I merge in the methylation_bam branch, which I expect to do in the next week or two.

Jared

jts avatar Oct 04 '21 13:10 jts

Hi Jared,

We use the Nextflow-ARTIC pipeline to obtain consensus fasta sequences for SARS-CoV-2 samples. We've updated the software on the Nanopore sequencing computer, but have since noticed that the vcf files from nanopolish contain no variants, so the subsequent consensus fasta sequence just matches the Wuhan reference (MN908947.3).

The version of nanopolish in bioconda is 0.13.2, so I was wondering if the latest version 0.13.3 of nanopolish might fix this, and if it might be possible to update bioconda with this latest 0.13.3 version of nanopolish.

The nanopolish command used with the Nextflow-ARTIC pipeline is:

nanopolish variants --verbose --min-flanking-sequence 10 -x 1000000 --progress -t 1 --reads barcode01.fastq -o barcode01.nCoV-2019_1.vcf -b barcode01.trimmed.rg.sorted.bam -g primer-schemes/nCoV-2019/V3/nCoV-2019.reference.fasta -w "MN908947.3:1-29904" --ploidy 1 -m 0.15 --read-group nCoV-2019_1

Thank you, Stephen.

sbridgett avatar Dec 14 '21 20:12 sbridgett

Hi @sbridgett,

There is no difference between 0.13.2 and 0.13.3 with respect to variant calling, so I suspect it isn't the cause of this issue. My first guess is that the FAST5 files are VBZ-compressed, but you don't have the VBZ decompression plugin loaded. Is this possible? You can read more about VBZ compression here: https://github.com/jts/nanopolish/issues/932#issuecomment-914303734 and here: https://github.com/nanoporetech/vbz_compression#vbz-compression
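One way to test this guess, sketched under stated assumptions: `h5dump -p -H` (from the HDF5 command-line tools) prints the filters applied to each dataset, and VBZ is assumed here to appear as user-defined filter ID 32020. The helper name `uses_vbz_filter` is hypothetical; it only inspects text passed on stdin.

```shell
# Hypothetical helper: read `h5dump -p -H file.fast5` output on stdin and
# succeed if any dataset lists HDF5 filter ID 32020 (assumed here to be VBZ).
uses_vbz_filter() {
    grep -Eq 'FILTER_ID[[:space:]]+32020'
}
```

It could be used as `h5dump -p -H some_read.fast5 | uses_vbz_filter && echo "VBZ-compressed"`, where `some_read.fast5` stands for any FAST5 file from the run.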

Jared

jts avatar Dec 14 '21 20:12 jts

Thank you for replying so quickly. It might be that the updated Nanopore software writes VBZ-compressed fast5 files now. I'll look into the VBZ decompression plugin. However, the nanopolish command used in the step of the Nextflow-ARTIC pipeline that writes the vcf file only reads from .fastq, .bam and .fasta files, not from a .fast5 file, so I'm not sure why it would need VBZ decompression at this step:

nanopolish variants --verbose --min-flanking-sequence 10 -x 1000000 --progress -t 1 --reads barcode01.fastq -o barcode01.nCoV-2019_1.vcf -b barcode01.trimmed.rg.sorted.bam -g primer-schemes/nCoV-2019/V3/nCoV-2019.reference.fasta -w "MN908947.3:1-29904" --ploidy 1 -m 0.15 --read-group nCoV-2019_1

The input "barcode01.fastq" file contains 324,882 reads.

And the "barcode01.trimmed.rg.sorted.bam" file has 29215 of the 29903 reference bases covered (97.7%), at a mean depth of 1023 reads:

$ samtools coverage barcode01.trimmed.rg.sorted.bam
#rname	startpos	endpos	numreads	covbases	coverage	meandepth	meanbaseq	meanmapq
MN908947.3	1	29903	91751	29215	97.6992	1023.17	19.2	60
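The `coverage` column in that output is a percentage of bases covered, while `meandepth` is the read depth, and the two are easy to mix up. As a minimal sketch (the function name `summarise_coverage` is hypothetical), the following reads the tab-separated `samtools coverage` output on stdin and prints both values per reference:

```shell
# Hypothetical helper: summarise `samtools coverage` TSV read from stdin.
# Columns used: 1=rname 2=startpos 3=endpos 5=covbases 7=meandepth
summarise_coverage() {
    awk -F '\t' '
        /^#/ { next }                      # skip the header line
        {
            ref_len = $3 - $2 + 1          # endpos - startpos + 1
            printf "%s: %d of %d bases covered (%.1f%%), mean depth %.0f\n",
                   $1, $5, ref_len, 100 * $5 / ref_len, $7
        }'
}
```

For example: `samtools coverage barcode01.trimmed.rg.sorted.bam | summarise_coverage`.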

sbridgett avatar Dec 14 '21 22:12 sbridgett

Nanopolish always reads the fast5 files. In this case it uses the index files for the fastq to determine which ones to load.
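As a rough illustration of that lookup: `nanopolish index` writes several files next to the FASTQ, including a `.index.readdb` file. The sketch below assumes that file is tab-separated, with the read ID in column 1 and the FAST5 path in column 2. The function name `list_indexed_fast5` is hypothetical. It lists the distinct FAST5 files that a given FASTQ points to.

```shell
# Hypothetical helper: list the FAST5 files referenced by a FASTQ's index.
# Assumes <fastq>.index.readdb is TSV: read_id <TAB> fast5_path.
list_indexed_fast5() {
    readdb="$1.index.readdb"
    if [ ! -f "$readdb" ]; then
        echo "no index found for $1 (expected $readdb)" >&2
        return 1
    fi
    # Column 2 holds the FAST5 path; many reads share one file.
    cut -f 2 "$readdb" | sort -u
}
```

For example: `list_indexed_fast5 barcode01.fastq`.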

Jared


jts avatar Dec 14 '21 22:12 jts

Sorry, I hadn't realised that Nanopolish also reads the fast5 files when they aren't given in the command-line parameters.

You're right that VBZ compression of the fast5 files is the cause.

I checked with the lab, and the problem started after the MinION software was updated on the nanopore computer. According to the MinION release notes, VBZ compression is now enabled by default.

I installed the latest 0.13.3 version of Nanopolish, and it exits with an error message explaining about the missing plugin:

"The fast5 file is compressed with VBZ but the required plugin is not loaded. Please read the instructions here: https://github.com/nanoporetech/vbz_compression/issues/5"

However, the 0.13.2 version of Nanopolish in conda, running with the same command on the same files, doesn't exit with this error, but continues running, and finishes with the message:

[post-run summary] total reads 89710, unparseable: 0, qc fail: 0, could not calibrate: 0, no alignment: 0, bad fast5: 44855

but the resulting vcf file has no SNPs.
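With 0.13.2, the `bad fast5` count in that summary line (44855 of 89710 reads here) is the only visible hint that signal data was skipped. A minimal sketch of a guard for pipelines, assuming the summary line has the format shown above and is captured from the nanopolish log; the function name `check_bad_fast5` is hypothetical:

```shell
# Hypothetical helper: read nanopolish log text on stdin and return non-zero
# if the post-run summary reports any "bad fast5" reads.
check_bad_fast5() {
    awk '
        /\[post-run summary\]/ {
            total = $0; sub(/.*total reads /, "", total); sub(/,.*/, "", total)
            bad = $0;   sub(/.*bad fast5: /, "", bad);    sub(/[^0-9].*/, "", bad)
            total += 0; bad += 0           # force numeric values
            found = 1
        }
        END {
            if (!found) { print "no post-run summary found"; exit 2 }
            printf "bad fast5: %d of %d reads\n", bad, total
            exit (bad > 0 ? 1 : 0)
        }'
}
```

For example: `check_bad_fast5 < nanopolish.log || echo "check HDF5_PLUGIN_PATH"`, where `nanopolish.log` is a placeholder for wherever the pipeline stores the log.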

I've installed the hdf5plugin as per the instructions in that comment.

With the plugin library filename at the end of the path, ie:

export HDF5_PLUGIN_PATH=/home/myusername/ont-vbz-hdf-plugin-1.0.1-Linux/usr/local/hdf5/lib/plugin/libvbz_hdf_plugin.so

Nanopolish 0.13.3 still exits with the same error message above.

When I removed the library filename from the path:

export HDF5_PLUGIN_PATH=/home/myusername/ont-vbz-hdf-plugin-1.0.1-Linux/usr/local/hdf5/lib/plugin

then both Nanopolish 0.13.3 and 0.13.2 ran okay and produced a vcf file containing the expected SNPs.
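The difference is that HDF5 treats `HDF5_PLUGIN_PATH` as a list of directories to search, not as the path to a library file. A minimal sketch of a wrapper that accepts either form and always exports the directory; the function name `set_vbz_plugin_path` is hypothetical, and `libvbz_hdf_plugin.so` is the file name from the paths above:

```shell
# Hypothetical helper: export HDF5_PLUGIN_PATH as a directory, even if the
# caller passes the path to the plugin library file itself.
set_vbz_plugin_path() {
    p="$1"
    # If given the .so file, use its containing directory instead.
    if [ -f "$p" ]; then
        p=$(dirname "$p")
    fi
    if [ ! -d "$p" ]; then
        echo "not a directory: $p" >&2
        return 1
    fi
    # Warn (but do not fail) if the expected library is not present.
    if [ ! -f "$p/libvbz_hdf_plugin.so" ]; then
        echo "warning: libvbz_hdf_plugin.so not found in $p" >&2
    fi
    export HDF5_PLUGIN_PATH="$p"
}
```

For example: `set_vbz_plugin_path /home/myusername/ont-vbz-hdf-plugin-1.0.1-Linux/usr/local/hdf5/lib/plugin/libvbz_hdf_plugin.so` would export the `plugin` directory rather than the file.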

Thank you for your help with this. Much appreciated.

sbridgett avatar Dec 15 '21 20:12 sbridgett

Perhaps in the Nanopolish README.md, in the "Installing the latest code from github (recommended)" section, it might be worth adding a note that fast5 files have been VBZ-compressed by default since the recent MinION software update, so users need to install the hdf5plugin and set the 'HDF5_PLUGIN_PATH' variable to enable Nanopolish to read these files.

sbridgett avatar Dec 15 '21 20:12 sbridgett

Thank you for the detailed report, and for pointing out that the path in the comment I linked to is incorrect. I have fixed that. One of the differences between 0.13.2 and 0.13.3 is that 0.13.3 will warn when the plugin is missing, whereas 0.13.2 will silently skip the data.

> Perhaps on the Nanopolish README.md, in the "Installing the latest code from github (recommended)" section, it might be worth adding a note about fast5 files being VBZ-compressed since the recent MinION software update, and so need to install the hdf5plugin and set the 'HDF5_PLUGIN_PATH' path to enable Nanopolish to read these files.

I'll make a note along these lines when I release 0.14 (likely in January). This is a common issue so I'll try to devise a way to automatically install the plugin, if possible.

jts avatar Dec 15 '21 20:12 jts

Yes, if nanopolish release 0.14 could automatically install the hdf5plugin with nanopolish that would be good.

In conda, when I currently install nanopolish in a new environment using:

conda install -c bioconda -c conda-forge nanopolish

it installs nanopolish version 0.13.2, but doesn't install the hdf5plugin.

If I then run:

conda install -c bioconda -c conda-forge hdf5plugin

it downgrades nanopolish to version 0.12.5 in order to install hdf5plugin-2.1.2.

Instead, installing hdf5plugin using the instructions you gave, and setting the HDF5_PLUGIN_PATH, does work okay.

sbridgett avatar Dec 21 '21 21:12 sbridgett

Hi Jared,

since you have merged the methylation_bam branch, do you plan to update the anaconda nanopolish version to 0.14?

Thank you,

Mattia

mfurla avatar May 06 '22 12:05 mfurla

Yes, I'll try to do this. Thanks for the reminder.

jts avatar May 06 '22 13:05 jts