dada2
mergeSequenceTables()
Hi Ben,
I have to analyze >20 runs, and I ran dada2 separately on each one. I would like to combine them using mergeSequenceTables(). In many cases, the people who prepared the libraries included the same sample in 2 or 3 different sequencing runs because it had not worked properly. Instead of using repeats="sum", I wonder if there is a simple way to keep, for each duplicated sample, only the copy with the highest number of reads. Not sure if that makes sense.
Thanks for your help!
Nico
We haven't implemented that logic, but this is a reasonable enhancement request: adding a repeats="deepest" mode that performs as you describe.
For now, you can do this in R, but you will need to write a bit of custom code for this purpose. You can pull out the duplicated sample names pretty easily:
# unlist(lapply(...)) also works when the tables have different numbers of samples
all.sams <- unlist(lapply(list(st1, st2, st3), rownames))
dupes <- unique(all.sams[duplicated(all.sams)])
Then you can loop over the sequence tables for each duplicated sample name and... maybe delete the samples (rows) from each table that aren't the highest depth. Then pass those sequence tables to mergeSequenceTables().
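The loop described above could be sketched roughly as follows. This is only a minimal base-R illustration, not dada2 functionality: `keep_deepest` is a hypothetical helper name, the toy tables are made up, and it assumes each table is a samples-by-sequences count matrix with unique row names within a table.

```r
# Hypothetical helper (not part of dada2): for every sample name that occurs in
# more than one sequence table, keep only the copy with the most reads.
keep_deepest <- function(tables) {
  # One row per (table, sample) pair with its total read count.
  depth <- do.call(rbind, lapply(seq_along(tables), function(i) {
    data.frame(table = rep(i, nrow(tables[[i]])),
               sample = rownames(tables[[i]]),
               reads = unname(rowSums(tables[[i]])),
               stringsAsFactors = FALSE)
  }))
  # Sort deepest first within each sample, so duplicated() flags the shallower copies.
  depth <- depth[order(depth$sample, -depth$reads), ]
  drop <- depth[duplicated(depth$sample), ]
  # Remove the flagged rows from the table they came from.
  lapply(seq_along(tables), function(i) {
    st <- tables[[i]]
    st[!rownames(st) %in% drop$sample[drop$table == i], , drop = FALSE]
  })
}

# Toy tables (made-up counts): sample "A" appears in both runs.
st1 <- matrix(c(10, 0,
                 5, 5), nrow = 2, byrow = TRUE,
              dimnames = list(c("A", "B"), c("ACGT", "TTGA")))
st2 <- matrix(c(50, 20,
                 1,  0), nrow = 2, byrow = TRUE,
              dimnames = list(c("A", "C"), c("ACGT", "GGCA")))

filtered <- keep_deepest(list(st1, st2))
# The filtered list can then be merged, e.g.:
# merged <- dada2::mergeSequenceTables(tables = filtered)
```

Because no sample name is repeated after filtering, the merge should not need a repeats= setting. When two copies have exactly the same depth, this sketch keeps the one from the earlier table.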
Thanks! Will try that. I was thinking of creating a new table listing all the duplicates, the total read count for each of them, and the table each one came from, and then removing those with the lowest read counts.
Hi Ben,
We found a way to remove duplicates, but I just found a potential batch effect, and I was planning to add an extra step to remove contaminants using our negative controls (blanks). As I have multiple runs and therefore multiple blanks, what would be the best approach to remove them? I am also planning to use a PCR correction; what do you think?
Thanks for the help!
Nico
Hi Nico, On the topic of contaminants, you may want to take a look at our paper and the decontam software package (https://github.com/benjjneb/decontam) for dealing with contamination: https://microbiomejournal.biomedcentral.com/articles/10.1186/s40168-018-0605-2
That lays out our thinking on dealing with contaminants. If you have just one negative control per "batch", your options on what to do are a bit more limited than what we consider in the decontam paper. Probably you want to identify clear contaminant features (e.g. ASVs) from the single negative control in each batch, and remove those identified that way from all batches.
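As a rough illustration of that last suggestion, the base-R sketch below flags ASVs seen in any blank and drops them from the whole table. Everything named here is an assumption for illustration: `drop_blank_asvs` is a hypothetical helper, the `min_reads` cutoff is an arbitrary stand-in for "clear contaminant", and the counts are made up. It is not the decontam method.

```r
# Hypothetical helper (not part of dada2 or decontam): collect ASVs that reach
# `min_reads` in any blank, then drop those columns (and the blank rows).
drop_blank_asvs <- function(seqtab, blanks, min_reads = 10) {
  # TRUE for every ASV that passes the cutoff in at least one blank.
  in_blank <- colSums(seqtab[blanks, , drop = FALSE] >= min_reads) > 0
  seqtab[!rownames(seqtab) %in% blanks, !in_blank, drop = FALSE]
}

# Toy merged table (made-up counts) with one blank.
seqtab <- matrix(c(100, 3, 40,
                    80, 0, 55,
                     0, 0, 60), nrow = 3, byrow = TRUE,
                 dimnames = list(c("S1", "S2", "blank_run1"),
                                 c("ASV1", "ASV2", "ASV3")))

clean <- drop_blank_asvs(seqtab, blanks = "blank_run1")
```

A plain read-count cutoff like this is crude: it would also remove genuine taxa that leaked into a blank, so the flagged ASVs are worth inspecting before dropping them.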
Hi Ben,
Thanks for this information! I wonder what would happen if, let's say, I have 3 negative controls but one of them seems to have been cross-contaminated by another sample.
Cheers,
Nico
Then you can't just remove all the taxa you observe in the cross-contaminated negative control.