MUFFIN
Question on assembly strategy
Hello,
Maybe I have missed this somewhere, but I am a bit confused about how multiple samples are handled in the pipeline and how they should be specified as input. In my dataset I have 1 sample with nanopore reads and 8 samples with Illumina metagenomic (MG) reads. I assumed that all MG reads + all nanopore reads are assembled together and that the Illumina reads are then used for differential coverage. This seems not to be the case, as the pipeline exits (successfully) after only doing preprocessing when I supply individual files. When supplying multiple samples, is the hybrid assembly meant to run on each sample separately (requiring nanopore + Illumina data for each sample)?
The workflow figure suggests that it should be possible to supply different read sets for the assembly and for the differential coverage, but I did not find a way to do so in the options. If that were possible, I could run the assembly on the combination of all reads and then use the sample-specific read files for coverage.
Thanks in advance for your help.
Hello,
Sorry for the delayed answer.
At the moment, it is not possible to run multiple samples at the same time. You need to provide the Illumina and Nanopore data of 1 sample at a time. If you use the flye method, the Nanopore sample will be assembled and then polished with the provided Illumina sample.
To do the differential binning, use the following arguments to provide the paths to the additional files you want to include:
--extra_ill a list of additional Illumina sample files (full path, with a * instead of _R1,2.fastq) to use for the binning in MetaBAT2 and CONCOCT
--extra_ont a list of additional ONT sample files (full path) to use for the binning in MetaBAT2 and CONCOCT
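For anyone preparing the --extra_ill values by hand, here is a minimal Python sketch of one way to turn paired file paths into the "*" form described above. It assumes the pairs are named exactly <sample>_R1.fastq and <sample>_R2.fastq; the function name extra_ill_patterns and the example paths are hypothetical and not part of MUFFIN. It only builds the patterns. How the list itself has to be written on the command line (separator, quoting) is not covered here and should be taken from the README.

```python
import re

# Assumption: paired Illumina files end in _R1.fastq / _R2.fastq.
_PAIR_SUFFIX = re.compile(r"_R[12]\.fastq$")


def extra_ill_patterns(paths):
    """Return one '<full path prefix>*' pattern per paired sample.

    Paths that do not end in _R1.fastq or _R2.fastq are skipped.
    The order of first appearance is preserved.
    """
    patterns = []
    for path in paths:
        if not _PAIR_SUFFIX.search(path):
            continue  # not a recognised paired-end file name
        pattern = _PAIR_SUFFIX.sub("*", path)
        if pattern not in patterns:
            patterns.append(pattern)
    return patterns


if __name__ == "__main__":
    # Hypothetical file locations, for illustration only.
    files = [
        "/data/mg/sample2_R1.fastq",
        "/data/mg/sample2_R2.fastq",
        "/data/mg/sample3_R1.fastq",
        "/data/mg/sample3_R2.fastq",
    ]
    for pattern in extra_ill_patterns(files):
        print(pattern)
```

Each printed pattern covers both mates of one sample, so with 8 Illumina samples you would end up with one pattern per extra sample in addition to the one used for polishing.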
I will update the README to indicate these options clearly.
If you have further questions, do not hesitate to ask.