TideHunter
[abpoa_gen_cons] "Not enough sequences to perform msa"
After successfully compiling TideHunter and abPOA, I'm running into this error when attempting to generate a consensus sequence:
[abpoa_gen_cons] "Not enough sequences to perform msa"
However, my sequence file is of length: 195588 (as measured by zcat < *.fq.gz | wc -l) so I feel like that should be more than enough sequence?
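Note that `wc -l` counts lines rather than reads. If the file uses standard four-line FASTQ records, 195588 lines corresponds to 48897 reads. A minimal Python sketch for counting records directly, assuming four lines per record (the function name `count_fastq_records` is hypothetical and not part of TideHunter):

```python
import gzip


def count_fastq_records(path):
    """Count reads in a gzipped FASTQ file, assuming 4 lines per record."""
    with gzip.open(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    # A line count that is not a multiple of 4 suggests a truncated
    # or multi-line-record file rather than a standard FASTQ.
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4
```

For example, `count_fastq_records("FAR63237_pass_barcode01_9dc2df5e_12.fastq.gz")` would return the number of reads in the attached file.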
I am attempting to use a large gzipped fastq file as TideHunter input. As a minimal reproducible example, I will include one of the constituent fastq files, which is much smaller (FAR....12.fastq.gz):
FAR63237_pass_barcode01_9dc2df5e_12.fastq.gz
These fastqs were generated by the default MinKNOW basecaller from FAST5 files produced by nanopore sequencing on a MinION 9.3.4 flow cell.
My execute command for tidehunter was:
./TideHunter-v1.5.3.2/bin/TideHunter -f 3 *.fastq.gz
and I also attempted
./TideHunter-v1.5.3.2/bin/TideHunter -f 3 $(zcat < *.fastq.gz)
which has not yet failed with this error but also seems to be taking much longer than anticipated to run.
Thanks for your help
Hi,
With the command
~/program/TideHunter/bin/TideHunter -f3 ./FAR63237_pass_barcode01_9dc2df5e_12.fastq.gz > out
It works normally on my machine.
Can you also provide the data that causes the error [abpoa_gen_cons] "Not enough sequences to perform msa"?
Yan
I get the error with this file as well, but here is the file I am actually trying to run this on (sending a Google Drive link because the file is too large to attach, ~780 MB):
https://drive.google.com/file/d/1r7nMrOjSGGxFJrVlDaqwXpkYzHffcJsH/view?usp=sharing
Also just to be precise, the error specifically says "[abpoa_gen_cons] No enough sequences to perform msa."
I think the issue might once again be with compilation if it works on your machine...
I re-downloaded the TideHunter repo (git clone --recursive) and the updated abPOA within it (also --recursive) and tried to rebuild after including <arm_neon.h> in simde instructions, and changing march=native to mcpu=apple-m1 as I had done previously. I needed to change 3 instances of "%ld" in the src/main.c to "%lld" to address several warnings, but then make fails again with the same error as last time:
Undefined symbols for architecture arm64:
  "_ksw_extz2_sse", referenced from:
      _ksw2_global in ksw2_align.o
      _ksw2_global_with_cigar in ksw2_align.o
      _ksw2_right_ext in ksw2_align.o
      _ksw2_left_ext in ksw2_align.o
      _ksw2_right_extend in ksw2_align.o
      _ksw2_left_extend in ksw2_align.o
ld: symbol(s) not found for architecture arm64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [bin/TideHunter] Error 1
after doing make clean_all all armv8=1 aarch64=1 (which I believe are the correct flags for my Apple M1 Pro MBP)
Hi, with your 780 MB data, I did reproduce the error. Let me look into it; this may take some time.
Ok, thanks!
The larger file was produced by using zcat to combine several smaller fq.gz files like so
zcat *fastq.gz > newFile.fq.gz
Hope that is helpful!
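One thing worth checking with a merge like this: zcat writes decompressed data to stdout, so redirecting it into a file named .gz may produce plain text with a misleading extension. Gzip files can instead be concatenated byte-for-byte, and the result is still a valid multi-member gzip stream. A minimal Python sketch of that approach (the helper names `is_gzip` and `merge_gzip_files` are hypothetical; whether this is related to the error in this issue is not confirmed):

```python
import shutil

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip member


def is_gzip(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC


def merge_gzip_files(inputs, output):
    """Concatenate gzip files without recompressing them."""
    with open(output, "wb") as out:
        for path in inputs:
            if not is_gzip(path):
                raise ValueError(f"{path} does not look like a gzip file")
            with open(path, "rb") as src:
                # Raw byte copy: each input becomes one gzip member.
                shutil.copyfileobj(src, out)
```

Calling `is_gzip("newFile.fq.gz")` on the merged file is a quick way to see whether it is actually compressed.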
Santiago
Thanks for the bug push. I've re-downloaded and compiled the repo (successfully, with no alterations) on my Ubuntu 20.x distro, and TideHunter now runs successfully on the individual fastq files, but not on the merged fastq (it reproduces the error in the original issue comment). Odd behavior that I don't quite understand. Hope this helps!
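Since the individual files run but the merged one does not, one possible diagnostic is to scan the merged file for the first record that does not look like a well-formed FASTQ entry. A minimal Python sketch, assuming standard four-line records (the function name `first_bad_record` is hypothetical, and this is only a sanity check, not TideHunter's own parser):

```python
import gzip


def first_bad_record(path):
    """Return the 0-based index of the first malformed record, or None."""
    with gzip.open(path, "rt") as fh:
        index = 0
        while True:
            header = fh.readline()
            if not header:
                return None  # reached end of file with no problems
            seq = fh.readline().rstrip("\n")
            plus = fh.readline()
            qual = fh.readline().rstrip("\n")
            # A valid record has an '@' header, a '+' separator, and
            # sequence and quality strings of equal length.
            if (not header.startswith("@")
                    or not plus.startswith("+")
                    or len(seq) != len(qual)):
                return index
            index += 1
```

A return value of `None` means every record passed these basic checks; an integer points at the first record worth inspecting by hand.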
I did not see the error using mcf10a.fq.gz.
Can you upload the merged data?
The mcf10a file is a merged fastq.gz of 12 fq.gz files with the same barcode. The other merged file is 2.1 GB. Both produce the same error on my machine. Perhaps it's because I installed with wget instead of git clone on my Linux distro?
Maybe. Just try re-running it with a build from git clone.