aviary
Error with Vamb
I was running "recover" on a previously assembled long-read file. Overall I'm pretty happy with the outputs! But for some reason, even though Vamb generated bins, the logs show the following:
bin selection complete: 568 bins above score threshold selected.
extracting bins to data/das_tool_bins_pre_refine/das_tool_DASTool_bins
INFO:root:Using the following binners: [('vamb_bins/', 'fna', 'data/vamb_bins.tsv'), ('rosella_refined/final_bins/', 'fna', 'data/rosella_refined_bins.tsv'), ('semibin_refined/final_bins/', 'fna', 'data/semibin_refined_bins.tsv'), ('metabat_bins_sspec', 'fa', 'data/metabat_sspec_bins.tsv'), ('metabat_bins_ssens', 'fa', 'data/metabat_ssens_bins.tsv'), ('metabat_bins_sens', 'fa', 'data/metabat_sens_bins.tsv'), ('metabat_bins_spec', 'fa', 'data/metabat_spec_bins.tsv'), ('metabat2_refined/final_bins/', 'fna', 'data/metabat2_refined_bins.tsv')]
grep: data/vamb_bins//*.fna: No such file or directory
When I go into the data/vamb_bins folder this is what I see:
bins clusters.tsv done latent.npz lengths.npz log.txt mask.npz model.pt tnf.npz
"bins" is a folder with the vamb bins.
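The mismatch between where the log looks (`data/vamb_bins//*.fna`) and where the bins seem to live (the `bins` subfolder) can be checked with a small script. This is only a minimal sketch: `locate_bins` and the list of extensions are hypothetical names made up for illustration, not part of Aviary or Vamb, and I am assuming the bins are plain FASTA files with one of these suffixes.

```python
from pathlib import Path

# Assumed FASTA suffixes; adjust if the bins use a different extension.
FASTA_EXTENSIONS = (".fna", ".fa", ".fasta")


def locate_bins(binner_dir):
    """Return (top_level, nested) lists of FASTA files under binner_dir.

    top_level: files directly inside binner_dir (what a `dir/*.fna` glob sees).
    nested:    files in any subfolder, e.g. binner_dir/bins/.
    """
    root = Path(binner_dir)
    top_level = sorted(
        p for p in root.glob("*")
        if p.is_file() and p.suffix in FASTA_EXTENSIONS
    )
    nested = sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix in FASTA_EXTENSIONS and p.parent != root
    )
    return top_level, nested


if __name__ == "__main__":
    top, nested = locate_bins("data/vamb_bins")
    print(f"FASTA files directly in data/vamb_bins: {len(top)}")
    print(f"FASTA files in subfolders: {len(nested)}")
```

If the first count is zero and the second is not, that would be consistent with the grep error above: the files exist, just one directory level deeper than the pattern expects.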
There's also an error regarding the vamb_bins.tsv file in the DAS Tool log:
grep: data/vamb_bins//*.fna: No such file or directory
WARNING:root:Bin definition file data/vamb_bins.tsv is empty, suggesting that vamb_bins/ failed or did not not create any output bins.
INFO:root:Bin definition files created: ['data/rosella_refined_bins.tsv', 'data/semibin_refined_bins.tsv', 'data/metabat_sspec_bins.tsv', 'data/metabat_ssens_bins.tsv', 'data/metabat_sens_bins.tsv', 'data/metabat_spec_bins.tsv', 'data/metabat2_refined_bins.tsv']
INFO:root:Running DAS_Tool with command: DAS_Tool --search_engine diamond --write_bin_evals 1 --write_bins 1 -t 16 --score_threshold -42 -i data/rosella_refined_bins.tsv,data/semibin_refined_bins.tsv,data/metabat_sspec_bins.tsv,data/metabat_ssens_bins.tsv,data/metabat_sens_bins.tsv,data/metabat_spec_bins.tsv,data/metabat2_refined_bins.tsv -c /home/aljazdzy/miniconda3/envs/aviary/assembly_2.fasta -o data/das_tool_bins_pre_refine/das_tool >> logs/das_tool.log 2>&1
Running DAS Tool using 16 threads.
Indeed, when I inspect the vamb_bins.tsv file, it is empty.
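For reference, the bin definition file DAS Tool takes is a tab-separated table of contig ID and bin ID, so the empty file could in principle be rebuilt from the FASTA files in the `bins` subfolder. Below is a rough sketch of that idea, not how Aviary does it internally: `contigs_to_bins` and `write_tsv` are hypothetical helpers, and the `.fna` extension is an assumption taken from the binner list in the log.

```python
from pathlib import Path


def contigs_to_bins(bin_dir, extension=".fna"):
    """Collect (contig_id, bin_name) pairs from every FASTA file in bin_dir.

    The bin name is the file name without its extension; the contig ID is
    the first whitespace-delimited token of each FASTA header line.
    """
    rows = []
    for fasta in sorted(Path(bin_dir).glob(f"*{extension}")):
        with fasta.open() as handle:
            for line in handle:
                if line.startswith(">"):
                    fields = line[1:].split()
                    if fields:  # skip malformed headers with no ID
                        rows.append((fields[0], fasta.stem))
    return rows


def write_tsv(rows, out_path):
    """Write the pairs as a tab-separated contig-to-bin table."""
    with open(out_path, "w") as out:
        for contig, bin_name in rows:
            out.write(f"{contig}\t{bin_name}\n")
```

Something like `write_tsv(contigs_to_bins("data/vamb_bins/bins"), "vamb_bins_manual.tsv")` would write the table to a separate file (the output name here is made up), which avoids overwriting anything the pipeline generated.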
The end of the log.txt file in the vamb_bins folder has the following:
Encoding to latent representation
Trained VAE and encoded in 15307.5 seconds
Clustering
Windowsize: 200
Min successful thresholds detected: 20
Max clusters: None
Min cluster size: 1
Use CUDA for clustering: False
Separator: None
Clustered 93061 contigs in 60802 bins
Clustered contigs in 632.37 seconds
Writing FASTA files
Minimum FASTA size: 200000
Wrote 11286 contigs to 459 FASTA files
Wrote FASTA in 9.52 seconds
Completed Vamb in 15961.93 seconds
This indicates to me that Vamb was successful in creating bins.
Do you have any advice and/or ideas how to fix this issue?