LRBinner
Computing Contig Length Error
Hi @anuradhawick
Sorry to keep inundating you with issues. I'm trying to bin contigs that I have assembled from ONT data using metaFlye, but I am running into the following error (the command used is also provided).
Any advice would be greatly appreciated, Calum
LRBinner contigs --reads-path FastQ_Raw/WW_combined_porechopped.fastq --contigs Assemblies/WW_flye_auto/Medaka/consensus.fasta --output LRBinner_Output/WW/
2021-11-19 13:01:31,696 - INFO - Command /home/cwwalsh/Software/LRBinner/LRBinner contigs --reads-path FastQ_Raw/WW_combined_porechopped.fastq --contigs Assemblies/WW_flye_auto/Medaka/consensus.fasta --output LRBinner_Output/WW/
2021-11-19 13:01:31,697 - INFO - Computing contig lengths
Traceback (most recent call last):
File "/home/cwwalsh/Software/LRBinner/LRBinner", line 197, in <module>
main()
File "/home/cwwalsh/Software/LRBinner/LRBinner", line 179, in main
pipelines.run_contig_binning(args)
File "/home/cwwalsh/Software/LRBinner/mbcclr_utils/pipelines.py", line 63, in run_contig_binning
pickle.dump(contig_length, open(f"{output}/profiles/contig_lengths.pkl", "wb+"))
FileNotFoundError: [Errno 2] No such file or directory: 'LRBinner_Output/WW//profiles/contig_lengths.pkl'
Hi @cazzlewazzle89,
Thanks for the issue, much appreciated. I am currently building this part of LRBinner, so a proper fix might take some time.
I suspect this has something to do with the output path --output LRBinner_Output/WW/. Could you try a directory that is not nested, such as --output LRBinner_Output_WW/? I do not recall LRBinner supporting nested output paths.
However, I will keep this in mind and make fixes. I have not rigorously tested the contigs option of LRBinner, as it goes beyond what I presented in the paper. Let me know if this fixes the error; if not, I will get into it ASAP.
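For reference, here is a rough sketch of the kind of guard I am thinking of adding in pipelines.py. It is untested and assumes the failure really is just the missing profiles/ directory when a nested output path is given (the helper name is a placeholder; the actual dump happens inline in run_contig_binning):

```python
# Untested sketch: create the profiles/ directory (including any nested
# parents of the output path) before writing the pickle.
import os
import pickle

def dump_contig_lengths(contig_length, output):
    profiles_dir = os.path.join(output, "profiles")
    os.makedirs(profiles_dir, exist_ok=True)  # handles nested output paths too
    with open(os.path.join(profiles_dir, "contig_lengths.pkl"), "wb") as handle:
        pickle.dump(contig_length, handle)
```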
Our research group has another binner, MetaCOAG (preprint: https://www.biorxiv.org/content/10.1101/2021.09.10.459728v1), by @Vini2. I have actually tried to use the MetaCOAG idea in the deep-learning model with her help. Just letting you know, as this might be worth looking into; MetaCOAG has been well tested on assembled contigs.
Keep in touch!
Cheers, Anuradha
Thanks @anuradhawick
That fixed the issue I was having, and the software appears to have run successfully. I assume the second column in the bins.txt file contains the (zero-based) bin IDs into which to group the contigs?
Thanks for sharing the MetaCOAG software. I don't think it will be useful here as this is a nanopore-only dataset so I don't have short reads with which to calculate coverage. I will keep it in mind for future datasets though.
All the best, Calum
Yes. The bins file contains a zero-based bin ID for each sequence in the input file.
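If it helps, here is a minimal sketch for loading the assignments in Python. It assumes two whitespace-separated columns, sequence ID then bin ID; adjust if your file differs:

```python
# Load bins.txt into a {sequence_id: bin_id} dictionary.
# Assumes two whitespace-separated columns: sequence ID, zero-based bin ID.
def load_bins(path="bins.txt"):
    assignments = {}
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) >= 2:
                assignments[fields[0]] = int(fields[1])
    return assignments
```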
Let me know if there's anything else, or if you have any feedback on the results.
Cheers, Anuradha.
Thanks @anuradhawick
There are a few contigs omitted from the output (present in the multifasta input but not in the bins.txt output). Is that to be expected?
That is expected in the current implementation, because HDBSCAN outputs noise points and those are not binned. This is the main reason I output bins.txt as a file containing both the seq id and the bin id.
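If you want to list the unbinned (noise) contigs explicitly, something like this should work (again assuming the two-column bins.txt and a plain FASTA input; the filenames are placeholders):

```python
# List contigs present in the assembly but absent from bins.txt,
# i.e. the HDBSCAN noise points that are left unbinned.
def unbinned_contigs(assembly_fasta, bins_txt):
    binned = set()
    with open(bins_txt) as handle:
        for line in handle:
            fields = line.split()
            if fields:
                binned.add(fields[0])
    unbinned = []
    with open(assembly_fasta) as handle:
        for line in handle:
            if line.startswith(">"):
                seq_id = line[1:].split()[0]
                if seq_id not in binned:
                    unbinned.append(seq_id)
    return unbinned
```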
Do you have any feedback on this? Is it worth improving this feature as part of LRBinner in the future?
That makes perfect sense to me. I assumed they were unbinned/undetermined and would have expected this but just wanted to confirm. Thanks
What could be the reason for this error:
2022-07-25 22:50:09,153 - INFO - Command /storage/home/hcoda1/4/jzhao399/p-ktk3-0/miniconda3/envs/lrbinner/bin/LRBinner contigs --reads-path ../min17.noHost.sup1K.fastq --k-size 4 --threads 12 --output LRBinner --contigs assembly.fasta
2022-07-25 22:50:09,155 - INFO - Computing contig lengths
Traceback (most recent call last):
File "/storage/home/hcoda1/4/jzhao399/p-ktk3-0/miniconda3/envs/lrbinner/bin/LRBinner", line 197, in
I followed the install guide exactly to install LRBinner and create the conda environment.
Thanks,
Jianshu