Error with Vamb

Open aljazdzy opened this issue 7 months ago • 0 comments

I was running "recover" using a previously assembled long-read file. Overall I'm pretty happy with the outputs! But for some reason Vamb generated bins, and yet according to the logs they were not found:

bin selection complete: 568 bins above score threshold selected.
extracting bins to data/das_tool_bins_pre_refine/das_tool_DASTool_bins
INFO:root:Using the following binners: [('vamb_bins/', 'fna', 'data/vamb_bins.tsv'), ('rosella_refined/final_bins/', 'fna', 'data/rosella_refined_bins.tsv'), ('semibin_refined/final_bins/', 'fna', 'data/semibin_refined_bins.tsv'), ('metabat_bins_sspec', 'fa', 'data/metabat_sspec_bins.tsv'), ('metabat_bins_ssens', 'fa', 'data/metabat_ssens_bins.tsv'), ('metabat_bins_sens', 'fa', 'data/metabat_sens_bins.tsv'), ('metabat_bins_spec', 'fa', 'data/metabat_spec_bins.tsv'), ('metabat2_refined/final_bins/', 'fna', 'data/metabat2_refined_bins.tsv')]
grep: data/vamb_bins//*.fna: No such file or directory

When I go into the data/vamb_bins folder this is what I see:


bins  clusters.tsv  done  latent.npz  lengths.npz  log.txt  mask.npz  model.pt  tnf.npz

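"bins" is a folder containing the Vamb bins, so the FASTA files sit one level below the data/vamb_bins/ directory that the grep pattern above searches.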
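
A quick way to confirm where the bin FASTA files actually live is to count them per subdirectory. The snippet below is only a minimal sketch: the helper name `count_fasta_by_dir` and the list of FASTA suffixes are made up for illustration and are not part of Aviary or Vamb.

```python
from collections import Counter
from pathlib import Path

# Assumption: bins use one of these common FASTA extensions.
FASTA_SUFFIXES = {".fna", ".fa", ".fasta"}


def count_fasta_by_dir(root):
    """Return {relative_directory: Counter of FASTA suffixes} for files under root."""
    root = Path(root)
    counts = {}
    if not root.is_dir():
        return counts
    for path in root.rglob("*"):
        if path.is_file() and path.suffix in FASTA_SUFFIXES:
            # "." means the file sits directly in root; "bins" means root/bins, etc.
            rel_dir = str(path.parent.relative_to(root))
            counts.setdefault(rel_dir, Counter())[path.suffix] += 1
    return counts
```

Calling `count_fasta_by_dir("data/vamb_bins")` should then show whether the `.fna` files are reported under `bins` rather than under `.`, which would match the failing pattern `data/vamb_bins//*.fna`.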

There's also an error regarding the vamb_bins.tsv file in the DAS Tool log:

grep: data/vamb_bins//*.fna: No such file or directory
WARNING:root:Bin definition file data/vamb_bins.tsv is empty, suggesting that vamb_bins/ failed or did not not create any output bins.
INFO:root:Bin definition files created: ['data/rosella_refined_bins.tsv', 'data/semibin_refined_bins.tsv', 'data/metabat_sspec_bins.tsv', 'data/metabat_ssens_bins.tsv', 'data/metabat_sens_bins.tsv', 'data/metabat_spec_bins.tsv', 'data/metabat2_refined_bins.tsv']
INFO:root:Running DAS_Tool with command: DAS_Tool --search_engine diamond --write_bin_evals 1 --write_bins 1 -t 16 --score_threshold -42         -i data/rosella_refined_bins.tsv,data/semibin_refined_bins.tsv,data/metabat_sspec_bins.tsv,data/metabat_ssens_bins.tsv,data/metabat_sens_bins.tsv,data/metabat_spec_bins.tsv,data/metabat2_refined_bins.tsv         -c /home/aljazdzy/miniconda3/envs/aviary/assembly_2.fasta         -o data/das_tool_bins_pre_refine/das_tool >> logs/das_tool.log 2>&1
Running DAS Tool using 16 threads.

Indeed, when I inspect the vamb_bins.tsv file, it is empty.
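
For reference, a bin definition file of the kind DAS Tool reads is a tab-separated table with one contig ID and one bin name per line. The sketch below rebuilds such a table from a directory of bin FASTA files. The function name `write_contig_to_bin` is hypothetical, and whether Aviary would pick up a manually regenerated data/vamb_bins.tsv on a rerun is an untested assumption.

```python
from pathlib import Path


def write_contig_to_bin(bin_dir, out_tsv, suffixes=(".fna", ".fa", ".fasta")):
    """Write 'contig<TAB>bin' for each FASTA header in bin_dir; return the line count."""
    written = 0
    with open(out_tsv, "w") as out:
        for fasta in sorted(Path(bin_dir).iterdir()):
            if not fasta.is_file() or fasta.suffix not in suffixes:
                continue
            with open(fasta) as handle:
                for line in handle:
                    if not line.startswith(">"):
                        continue
                    fields = line[1:].split()
                    if fields:
                        # Contig ID is the first token of the header;
                        # the bin name is the file name without its extension.
                        out.write(f"{fields[0]}\t{fasta.stem}\n")
                        written += 1
    return written
```

For example, `write_contig_to_bin("data/vamb_bins/bins", "vamb_bins_manual.tsv")` would write to a separate file (the output name here is arbitrary) so the pipeline's own data/vamb_bins.tsv is left untouched while checking the result.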

The end of the log.txt file in the vamb_bins folder has the following:

  Encoding to latent representation
        Trained VAE and encoded in 15307.5 seconds

Clustering
        Windowsize: 200
        Min successful thresholds detected: 20
        Max clusters: None
        Min cluster size: 1
        Use CUDA for clustering: False
        Separator: None

        Clustered 93061 contigs in 60802 bins
        Clustered contigs in 632.37 seconds

Writing FASTA files
        Minimum FASTA size: 200000

        Wrote 11286 contigs to 459 FASTA files
        Wrote FASTA in 9.52 seconds

Completed Vamb in 15961.93 seconds

This indicates to me that Vamb did successfully create bins.

Do you have any advice and/or ideas how to fix this issue?

Posted by aljazdzy, Jul 13 '24 22:07