Question about using DADA2 for Metazoan study

Open ucbtmae opened this issue 2 years ago • 4 comments

Dear Dr Callahan,

I hope this message finds you well.

Thanks a lot for all the work and development you have put into this package. It is very useful for my PhD project, which focuses on eDNA metabarcoding of amphibians and fish (among other things) using mitochondrial 12S and 16S as molecular markers. I have been running DADA2 following your tutorial, and I have managed to get it working all the way up to the assignTaxonomy step. You also recommend trying IdTaxa from the DECIPHER package, but the instructions seem to be built around training sets for bacterial sequences. I am still new to the bioinformatics side of the project, and I wondered how to adapt this last step to assign taxonomy to the ASVs according to my project's needs. For example, is there a way to build a training set specifically for metazoan 12S and another for metazoan 16S? Could I do it with sequences available from GenBank?

We are also working on setting up a reference database with our own Sanger sequences.

Best,

Alejandro

ucbtmae avatar Feb 11 '22 15:02 ucbtmae

Hi Alejandro, Glad that the package has been useful for you! For taxonomic assignment in DADA2, you should take a look at the taxonomic references page: https://benjjneb.github.io/dada2/training.html

The reference fasta you give to assignTaxonomy completely determines what can and will be assigned to your sequences. Unfortunately, none of the references we currently have compiled are for mito 12S or metazoan 16S. So, to use assignTaxonomy for those markers, you will need to assemble a custom database, or translate available databases for those markers into the format that assignTaxonomy expects. See the bottom section of the taxonomic references page for that format.
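As a rough illustration of what "translating into the compatible format" can look like, here is a minimal Python sketch. It assumes the header conventions described on the taxonomic references page: `>Level1;Level2;...;` for assignTaxonomy and `>ID Genus species` for assignSpecies. The `RefRecord` type, the function names, the accession IDs, and the short sequences are all hypothetical placeholders, and the choice of ranks is only one possible scheme.

```python
from typing import Iterable, List, NamedTuple


class RefRecord(NamedTuple):
    """One curated reference sequence (hypothetical structure for this sketch)."""
    accession: str       # any unique ID, e.g. a GenBank accession
    lineage: List[str]   # ranks from highest to lowest, ending at genus
    species: str         # binomial, e.g. "Salmo trutta"
    sequence: str        # marker sequence (12S or 16S amplicon region)


def to_assign_taxonomy_fasta(records: Iterable[RefRecord]) -> str:
    """Write headers as '>Level1;Level2;...;' (assumed assignTaxonomy format)."""
    lines = []
    for rec in records:
        lines.append(">" + ";".join(rec.lineage) + ";")
        lines.append(rec.sequence.upper())
    return "\n".join(lines) + "\n"


def to_assign_species_fasta(records: Iterable[RefRecord]) -> str:
    """Write headers as '>ID Genus species' (assumed assignSpecies format)."""
    lines = []
    for rec in records:
        lines.append(f">{rec.accession} {rec.species}")
        lines.append(rec.sequence.upper())
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder records: the IDs and sequences are made up for illustration.
    records = [
        RefRecord(
            "REF0001",
            ["Eukaryota", "Chordata", "Amphibia", "Anura", "Bufonidae", "Rhinella"],
            "Rhinella marina",
            "acgtacgttagc",
        ),
        RefRecord(
            "REF0002",
            ["Eukaryota", "Chordata", "Actinopteri", "Salmoniformes", "Salmonidae", "Salmo"],
            "Salmo trutta",
            "ttgacgtaccga",
        ),
    ]
    print(to_assign_taxonomy_fasta(records))
    print(to_assign_species_fasta(records))
```

The two output strings would be saved as separate fasta files, one per function and per marker, so a 12S reference and a 16S reference stay independent.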

Also, if there is an established tool/reference for assigning taxonomy to e.g. mito 12S, you can also consider incorporating that outside tool into your workflow. Sometimes that can be easier than reshaping everything around assignTaxonomy format.

benjjneb avatar Feb 15 '22 15:02 benjjneb

Hi @ucbtmae, I am touching base to check whether there is any development here, or if you have found a reliable reference db for 12S :)

Thank you,

Domenico

domenico-simone avatar Mar 23 '22 12:03 domenico-simone

Hi @benjjneb and @domenico-simone

I have been working on this, and I have managed to create custom databases (mitochondrial sequences only) from GenBank for my target amphibian and fish groups. I then began testing them with the assignSpecies function on the preliminary sequencing data I currently have. And it works!

The program needs a 100% match between the sequences to make an assignment, right? If so, it is a bit stringent for my current databases. For example, when I BLAST one ASV and get a hit with 99% identity, BLAST does make an assignment while dada2 returns NA.

One of my current aims is to create a custom reference database from the beginning (extracting DNA from preserved tissues) with just the markers I am using.

Cheers!

Alejandro

ucbtmae avatar Mar 23 '22 12:03 ucbtmae

> The program needs a 100% match between the sequences to make an assignment, right? If so, it is a bit stringent for my current databases. For example, when I BLAST one ASV and get a hit with 99% identity, BLAST does make an assignment while dada2 returns NA.

Yes this is right. assignSpecies is a very particular method based on exact matching and only exact matching: https://benjjneb.github.io/dada2/assign.html#species-assignment

We developed assignSpecies for the particular problem of assigning species-level taxonomy to short-read 16S sequencing, and we still think it is an excellent method for that task. However, it often (usually?) won't be the right method for species assignment in other taxonomic groups and markers, as there may not be the same rough quantitative match between single-nucleotide variation in the marker-gene and species-level taxonomy.
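To make the difference from BLAST concrete, here is a small Python sketch of exact-match assignment. It is a simplified illustration of the idea, not dada2's implementation: it assumes a query counts as a hit only when it appears verbatim inside a reference sequence, and it returns `None` (standing in for NA) when there is no hit or when more than one species matches. The function names and the toy sequences are hypothetical.

```python
from typing import List, Optional, Tuple


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def assign_species_exact(
    query: str,
    references: List[Tuple[str, str]],
    try_rc: bool = False,
) -> Optional[str]:
    """Return the species name if exactly one species matches the query exactly.

    references is a list of (species, sequence) pairs. A match requires the
    query to occur verbatim inside the reference sequence (an assumption of
    this sketch); a single mismatch is enough to lose the hit.
    """
    candidates = [query]
    if try_rc:
        candidates.append(reverse_complement(query))

    hits = {
        species
        for species, ref_seq in references
        if any(candidate in ref_seq for candidate in candidates)
    }
    # No hit, or an ambiguous hit across species, yields None (NA).
    return next(iter(hits)) if len(hits) == 1 else None


if __name__ == "__main__":
    # Toy reference and queries, made up for illustration only.
    refs = [("Salmo trutta", "ACGTTGCAAGGCTTAACCGGTTAGCA")]
    identical = "TGCAAGGCTTAACC"      # occurs verbatim in the reference
    one_mismatch = "TGCAAGGCTTTACC"   # differs at a single position
    print(assign_species_exact(identical, refs))
    print(assign_species_exact(one_mismatch, refs))
```

In this sketch the one-mismatch query gets no assignment even though an alignment-based tool would report it as a close hit, which mirrors the BLAST versus NA behaviour described above.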

benjjneb avatar Apr 06 '22 01:04 benjjneb