
Computing Contig Length Error

Open cazzlewazzle89 opened this issue 3 years ago • 7 comments

Hi @anuradhawick

Sorry to keep inundating you with issues. I'm trying to bin contigs that I have assembled from ONT data using metaFlye, but I am running into the following error (command used also provided).

Any advice would be greatly appreciated, Calum

LRBinner contigs --reads-path FastQ_Raw/WW_combined_porechopped.fastq --contigs Assemblies/WW_flye_auto/Medaka/consensus.fasta --output LRBinner_Output/WW/

2021-11-19 13:01:31,696 - INFO - Command /home/cwwalsh/Software/LRBinner/LRBinner contigs --reads-path FastQ_Raw/WW_combined_porechopped.fastq --contigs Assemblies/WW_flye_auto/Medaka/consensus.fasta --output LRBinner_Output/WW/
2021-11-19 13:01:31,697 - INFO - Computing contig lengths
Traceback (most recent call last):
  File "/home/cwwalsh/Software/LRBinner/LRBinner", line 197, in <module>
    main()
  File "/home/cwwalsh/Software/LRBinner/LRBinner", line 179, in main
    pipelines.run_contig_binning(args)
  File "/home/cwwalsh/Software/LRBinner/mbcclr_utils/pipelines.py", line 63, in run_contig_binning
    pickle.dump(contig_length, open(f"{output}/profiles/contig_lengths.pkl", "wb+"))
FileNotFoundError: [Errno 2] No such file or directory: 'LRBinner_Output/WW//profiles/contig_lengths.pkl'

cazzlewazzle89 avatar Nov 19 '21 02:11 cazzlewazzle89

Hi @cazzlewazzle89,

Thanks for the issue, much appreciated. I am currently building this part of LRBinner so a straight fix might take some time to come.

I suspect this may have something to do with the output path --output LRBinner_Output/WW/. Could you try giving a directory that is not nested, like --output LRBinner_Output_WW/? I do not recall LRBinner supporting nested paths.
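The traceback shows `open()` failing because the `profiles` directory under the output path does not exist when the pickle is written. A minimal sketch of one way to guard against that, assuming a missing directory is the only problem: `dump_pickle_creating_dirs` is a hypothetical helper, not part of LRBinner, and the contig lengths are made up for illustration.

```python
import os
import pickle
import tempfile


def dump_pickle_creating_dirs(obj, path):
    """Pickle obj to path, creating any missing parent directories first.

    Hypothetical helper for illustration; not LRBinner's own code.
    """
    parent = os.path.dirname(path)
    if parent:
        # exist_ok=True makes this a no-op when the directory already exists.
        os.makedirs(parent, exist_ok=True)
    with open(path, "wb") as handle:
        pickle.dump(obj, handle)


# Demonstration in a throwaway directory, using a nested output path
# like the one in the report and invented contig lengths.
with tempfile.TemporaryDirectory() as tmp:
    output = os.path.join(tmp, "LRBinner_Output", "WW")
    target = f"{output}/profiles/contig_lengths.pkl"
    dump_pickle_creating_dirs({"contig_1": 1500, "contig_2": 32000}, target)
    with open(target, "rb") as handle:
        restored = pickle.load(handle)
```

`os.makedirs` creates every missing level of the path, so the same call covers both a nested output directory and a missing `profiles` subdirectory.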

However, I will keep this in mind and make fixes. I have not rigorously tested the contigs option of LRBinner as this was something beyond what I have presented in the paper. Let me know if this fixes the error, if not I will get into this ASAP.

Our research group has another binner MetaCOAG (preprint: https://www.biorxiv.org/content/10.1101/2021.09.10.459728v1) by @Vini2. I have actually tried to use the MetaCOAG idea in the deep learning model with her help. Just letting you know as this might be something worth looking into. MetaCOAG has been well tested on assembled contigs.

keep in touch!

Cheers Anuradha

anuradhawick avatar Nov 19 '21 03:11 anuradhawick

Thanks @anuradhawick

That fixed the issue I was having and the software appears to have run successfully. I assume the second column in the bins.txt file contains the (zero-based) binIDs into which to group the contigs?

Thanks for sharing the MetaCOAG software. I don't think it will be useful here as this is a nanopore-only dataset so I don't have short reads with which to calculate coverage. I will keep it in mind for future datasets though.

All the best, Calum

cazzlewazzle89 avatar Nov 21 '21 00:11 cazzlewazzle89

Yes. The bins file will have 0 based bins for each sequence in the input file.
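As a rough sketch of how the bins file could be consumed downstream, the snippet below groups sequence ids by bin id. It assumes each line holds a sequence id followed by an integer bin id, separated by whitespace; the delimiter, the function name `group_contigs_by_bin`, and the example ids are assumptions rather than anything confirmed in this thread.

```python
from collections import defaultdict


def group_contigs_by_bin(lines):
    """Map each zero-based bin id to the list of sequence ids assigned to it.

    Assumes lines of the form "<seq_id> <bin_id>" (whitespace-separated).
    """
    bins = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            # Skip blank or malformed lines rather than failing.
            continue
        seq_id, bin_id = fields[0], int(fields[1])
        bins[bin_id].append(seq_id)
    return dict(bins)


# Invented example content standing in for a real bins.txt.
example = ["contig_1 0", "contig_2 1", "contig_3 0"]
grouped = group_contigs_by_bin(example)
```

For a real run, the lines would come from something like `open("bins.txt")` inside the chosen output directory.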

Let me know if there’s anything else, or if you have any feedback on the results.

Cheers Anuradha.

anuradhawick avatar Nov 21 '21 00:11 anuradhawick

Thanks @anuradhawick There are a few contigs omitted from the output (present in the multifasta input but not in the bins.txt output). Is that to be expected?

cazzlewazzle89 avatar Nov 21 '21 23:11 cazzlewazzle89

That is expected in the current implementation, because HDBSCAN labels some points as noise and those are not binned. This is the main reason why I output bins.txt as a file containing both the seq id and the bin id.
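To see which contigs were left unbinned, one option is to compare the FASTA headers against the ids in bins.txt. This is only a sketch: it assumes the ids in bins.txt match the first word of each FASTA header and that bins.txt is whitespace-separated, and the function names and example data are hypothetical.

```python
def fasta_ids(lines):
    """Yield the sequence id (first word of each '>' header) from FASTA lines."""
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                yield fields[0]


def unbinned_contigs(fasta_lines, bins_lines):
    """Return FASTA sequence ids, in input order, that have no entry in bins.txt."""
    binned = {line.split()[0] for line in bins_lines if line.strip()}
    return [seq_id for seq_id in fasta_ids(fasta_lines) if seq_id not in binned]


# Invented data: contig_2 plays the role of a noise point that was not binned.
fasta = [">contig_1 len=10", "ACGTACGTAC", ">contig_2", "GGGTTT", ">contig_3", "AAAC"]
bins = ["contig_1 0", "contig_3 1"]
missing = unbinned_contigs(fasta, bins)
```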

Do you have any feedback on this? Is it worth improving this feature as a part of LRBinner in future?

anuradhawick avatar Nov 21 '21 23:11 anuradhawick

That makes perfect sense to me. I assumed they were unbinned/undetermined and would have expected this but just wanted to confirm. Thanks

cazzlewazzle89 avatar Nov 21 '21 23:11 cazzlewazzle89

What could be the reason for this error:

2022-07-25 22:50:09,153 - INFO - Command /storage/home/hcoda1/4/jzhao399/p-ktk3-0/miniconda3/envs/lrbinner/bin/LRBinner contigs --reads-path ../min17.noHost.sup1K.fastq --k-size 4 --threads 12 --output LRBinner --contigs assembly.fasta
2022-07-25 22:50:09,155 - INFO - Computing contig lengths
Traceback (most recent call last):
  File "/storage/home/hcoda1/4/jzhao399/p-ktk3-0/miniconda3/envs/lrbinner/bin/LRBinner", line 197, in <module>
    main()
  File "/storage/home/hcoda1/4/jzhao399/p-ktk3-0/miniconda3/envs/lrbinner/bin/LRBinner", line 179, in main
    pipelines.run_contig_binning(args)
  File "/storage/coda1/p-ktk3/0/jzhao399/rich_project_bio-konstantinidis/apps/LRBinner/mbcclr_utils/pipelines.py", line 63, in run_contig_binning
    pickle.dump(contig_length, open(f"{output}/profiles/contig_lengths.pkl", "wb+"))
FileNotFoundError: [Errno 2] No such file or directory: 'LRBinner/profiles/contig_lengths.pkl'

I followed the install guide exactly to install LRBinner and create the conda environment.

Thanks,

Jianshu

jianshu93 avatar Jul 26 '22 03:07 jianshu93