[question] Tutorial for having Nanopore reads only in Bactopia documentation
Hi @rpetit3 ,
Thank you so much for upgrading Bactopia to v2, which includes support for Nanopore-only reads! We have been waiting for this for a very long time!
Is there a tutorial in the Bactopia documentation that covers the analysis of Nanopore-only reads (e.g. bactopia prepare for Nanopore-only reads)?
Thank you very much!
Totally! I'll put this on my list for updating the docs
Thank you so much!
Hi Robert, is there any update regarding ONT reads? I'm processing some metagenomic samples that were sequenced with ONT. Each time I run Bactopia, the only output file generated is sample-genome-size.txt and the run fails. When I run Bactopia on a single sample, I get the same result (barcode01-genome-size.txt) and the run fails again. The genome size is 14594200, which is lower than the default max. Here is the command I used:
bactopia --sample barcode01 --SE barcode01.fastq.gz --datasets my_directory/datasets/ --outdir ${today}bactopia1Samp_output --max_cpus $SLURM_CPUS_PER_TASK
Note that on short-read samples this command runs properly. Thanks, TJ
In the next version I will be improving the documentation, which will include better demos of Nanopore data.
Hi Robert, I have been trying to run Bactopia on hybrid samples from Illumina and MinION. All the samples are in the directory fastqs_hybrid (2 SE samples and the 2 corresponding paired samples, 6 files in total). I used the following commands:
bactopia prepare fastqs_hybrid/ > fastqs_hybrid.txt
bactopia --fastqs fastqs_hybrid.txt --hybrid --datasets my_directory/datasets/ --outdir ${today}bactopia_outputHybrid --max_cpus ${SLURM_CPUS_PER_TASK} --cleanup_workdir
However, I got this error message:
ERROR: "CFI21000051" has paired and single-end FASTQs, please check.
ERROR: "CFI21000216" has paired and single-end FASTQs, please check.
- Each SE sample has the same root name as the corresponding paired sample, e.g. CFI21000051.fastq.gz (CFI21000051_1.fastq.gz, CFI21000051_2.fastq.gz). Could that be the problem? I used different names before, but then the SE samples only produced genome_size.text files.
- Are there any other options I should add to the command? Thanks, TJ
Tack on a --long_reads to the bactopia prepare command:
bactopia prepare fastqs_hybrid/ --long_reads > fastqs_hybrid.txt
Let me know how that works for you
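As a rough pre-flight check for the naming convention described in this thread (NAME.fastq.gz for the single-end or long reads, NAME_1.fastq.gz and NAME_2.fastq.gz for the pair), the sketch below reports which samples in a directory have all three files. classify_fastqs is a hypothetical helper, not part of Bactopia, and the file suffixes are assumptions taken from the file names quoted above.

```shell
# Hypothetical helper (not part of Bactopia): classify samples in a directory
# by which FASTQ files are present, assuming the NAME_1/NAME_2/NAME suffixes
# used in this thread.
classify_fastqs() {
    dir="$1"
    for r1 in "$dir"/*_1.fastq.gz; do
        # Skip the literal pattern when no file matches.
        [ -e "$r1" ] || continue
        name="$(basename "$r1" _1.fastq.gz)"
        if [ -e "$dir/${name}_2.fastq.gz" ] && [ -e "$dir/${name}.fastq.gz" ]; then
            echo "$name hybrid"        # pair plus a same-named single FASTQ
        elif [ -e "$dir/${name}_2.fastq.gz" ]; then
            echo "$name paired"        # pair only
        else
            echo "$name incomplete"    # R1 without a matching R2
        fi
    done
}
```

Running classify_fastqs fastqs_hybrid/ before bactopia prepare would show whether every sample intended for hybrid assembly really has the expected three files.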
Speaking of this, do you think you would ever be interested in running hybrid assemblies where the ONT reads are assembled first and then polished with Illumina reads?
Your point really makes sense. So, does Bactopia allow that, i.e. polishing after the assembly? Indeed, my goal is to compare the results between hybrid and homogeneous assemblies. That's why I am investigating this. Thanks
At the moment, no. Bactopia only supports hybrid assembly via Unicycler (assemble with SPAdes, then polish with long reads).
However, I recently added support for the reverse (assemble with long reads, then polish with short reads) in Dragonflye (https://github.com/rpetit3/dragonflye/releases/tag/v1.0.9). I was planning to float this support up to Bactopia as well.
I was just curious whether it's something you'd like to see added.
I welcome the recommendation. Part of my work will be assembling complete or near-complete genomes for the organisms I am working on. So, when we do hybrid assembly, does polishing become unnecessary? Or are they two options we can choose from? I'm building experience...
Thanks!
In hybrid assemblies, you are polishing with either short reads or long reads. Polishing also happens in standard assemblies. If you use Shovill or Dragonflye (Bactopia uses both), all assemblies are polished by default.
So yeah you would have two options for hybrid assembly:
- assemble with short reads, polish with long reads
- assemble with long reads, polish with short reads
I think you might have a fun little experiment on your hands! You could compare the outcomes of the two approaches, to see if one leads to better outcomes, or they are similar.
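One simple starting point for such a comparison is to compute basic contiguity statistics for each assembly. The sketch below is only an illustration: assembly_stats is a hypothetical helper (not part of Bactopia, Unicycler, or Dragonflye), it assumes uncompressed FASTA input, and contig count, total length, and N50 are just one possible set of metrics.

```shell
# Hypothetical helper: print contig count, total length and N50 for an
# uncompressed FASTA file.
assembly_stats() {
    awk '
        /^>/ { if (len > 0) print len; len = 0; next }  # header: emit the previous contig length
        { len += length($0) }                           # sequence line: accumulate bases
        END { if (len > 0) print len }                  # emit the last contig
    ' "$1" | sort -rn | awk '
        { lens[NR] = $1; total += $1 }                  # lengths arrive longest first
        END {
            half = total / 2; run = 0; n50 = 0
            for (i = 1; i <= NR; i++) {
                run += lens[i]
                # N50: length of the contig at which the running sum reaches half the total
                if (run >= half) { n50 = lens[i]; break }
            }
            printf "contigs=%d total=%d n50=%d\n", NR, total, n50
        }
    '
}
```

Calling assembly_stats on the assembly from each approach (for example, on two files named short_first.fasta and long_first.fasta, both hypothetical names) gives one line per assembly that can be compared side by side.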
Currently Bactopia's --hybrid is short read first, then polish with long reads (this is done using Unicycler). But if you are interested, I'll get a dev version for you that allows you to do the opposite (long read first, polish with short reads).
Oh, yes! This kind of hybridization (long read first, polish with short reads) sounds interesting. We are getting more and more long reads to process. I appreciated you getting back so quickly. Thanks a lot!
Awesome! I'll be in touch in a few days with an update. I'm excited to see what conclusions you come to!
Hi Robert, when you run Bactopia on one hybrid sample (mysample.fastq plus mysample_1.fastq and mysample_2.fastq), how many output directories should I expect?
- I imagine just one (bactopia_output/mysample)
- But I would like to make sure I am right. -Thanks, TJ
Yeah just one should be right
In the documentation, I noticed we can run Bactopia on assembly files (.fasta). Is it a good idea to run Bactopia on the assembly files (in the assembly sub-directory) generated by Bactopia itself? Thanks!
I would say no. This feature was really meant for samples from NCBI Assembly or when the FASTQs aren't available but an assembly is (e.g. some older studies/projects)
Hi Robert,
Is there any update about running bactopia on assembly files? I ran bactopia to process some local .fasta files, but had no output. The error log file is empty though.
Here is the command:
for s in $(cat samples.txt); do
    bactopia --sample ${s} --assembly assemblies_zipped/${s}.fasta.gz --assembly_pattern *.fasta.gz --datasets /color/my_directory/datasets/ --outdir ${today}bactopia_output --max_cpus $SLURM_CPUS_PER_TASK --cleanup_workdir
done
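One detail that may be worth checking in a loop like the one above (this is an assumption; the thread does not confirm it as the cause of the empty output): an unquoted pattern such as *.fasta.gz is expanded by the shell before the program sees it whenever matching files exist in the current directory. The sketch below demonstrates the effect with count_args, a hypothetical stand-in for any command-line tool. It does not call Bactopia.

```shell
# Hypothetical stand-in for any command-line tool: report how many arguments it received.
count_args() {
    echo "$#"
}

# Scratch directory with two files that match the pattern.
demo_dir="$(mktemp -d)"
touch "$demo_dir/a.fasta.gz" "$demo_dir/b.fasta.gz"

# Unquoted: the shell replaces the pattern with the matching file names.
unquoted="$(cd "$demo_dir" && count_args --assembly_pattern *.fasta.gz)"
# Quoted: the pattern reaches the program as a single literal argument.
quoted="$(cd "$demo_dir" && count_args --assembly_pattern '*.fasta.gz')"

echo "unquoted=$unquoted quoted=$quoted"
rm -r "$demo_dir"
```

With two matching files present, the unquoted call receives three arguments and the quoted call receives two, so quoting the pattern is the safer habit whenever a tool is meant to interpret the pattern itself.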
Thanks!
TJ
@tbazilegith I released v2.1.0 (https://github.com/bactopia/bactopia/releases/tag/v2.1.0), which now allows long-read assembly with short-read polishing (--short_polish). Let me know if there are any questions or issues!
Cheers, Robert