
Reference database problem

Open safiqu opened this issue 2 years ago • 1 comments

Dear dada2 users,

I am trying to use our customised reference data set. The genus-level format works, but the species-level format does not. I have tried many things but could not solve it.

Genus data format is: Bacteria;Actinobacteriota;Actinobacteria;Propionibacteriales;Propionibacteriaceae;Cutibacterium;Cutibacterium acnes_A1; GTTGCACACCAGGGGGTCAACTTGGCGTCCTCAGTTCAAAATTGATTCAAACTAACAGTTCCATGTCGGGAAACAGCACCAGGAAGCTCGTGACATATCGTCTTTCATTGCGAGAAACATCTTACTTATGTACATTTCTAAGCTATAGCGTCTACCCTTGTCAGACCCAGGACGATGGGTGTCACATCTCCTTTCTAGTCAACCTAAGAGAGGAGGAAATGCCGCGATATATGTTCCACCCTGTCATCACGAAGGCCACCACAATCTATCCCAGAACAGCCGGCACTTCACTCACGATGCCCCGATGCTGGATTCCTATTGTCGCCCTTATTAGGGCAAGCGGTGCCAGTAGCAGAATATGTCACCTCAACAACTCGATCCACCCCTGCCCATTACATGGGTAACATATCCATGGAGGTTCGATGTATACTCGAGGATACAGTCGTCCATCACGCCCGCCTACATACCCATTACATCAGCATAG

Bacteria;Actinobacteriota;Actinobacteria;Propionibacteriales;Propionibacteriaceae;Cutibacterium;Cutibacterium acnes_A10; GTTGCACACCAGGGGGTCAACTTGGCGTCCTCAGTTCAAAATTGATTCAAACTAACAGTTCCATGCCGGGAAACAGCACCAGGAAGCTCGTGACATATCGTCTTTCATTGCGAGAAACATCTTACTTATGTACATTTCTAAGCTATAGCGTCTACCCTTGTCAGACCCAGGACGATGGGTGTCACATCTCCTTTCTAGTCAACCTAAGAGAGGAGGAAATGCCGCGATATATGTTCCACCCTGTCATCACGAAGGCCACCACAATCTATCCCAGAACAGCCGGCACTTCACTCACGATGCCCCGATGCTGGATTCCTATTGTCGCCCTTATTAGGGCAAGCGGTGCCAGTAGCAGAATATGTCACCTCAACAACTCGATCTACCCCTGCCCATTACATGGGTAACATATCCATGGAGGTTCGATGTATACTCGAGGATACAGTCGTCCATCACGCCCGCCTACATACCCATTACATCAGCATAG

Species data Format is:

ID001 Cutibacterium acnes_A1 GTTGCACACCAGGGGGTCAACTTGGCGTCCTCAGTTCAAAATTGATTCAAACTAACAGTTCCATGTCGGGAAACAGCACCAGGAAGCTCGTGACATATCGTCTTTCATTGCGAGAAACATCTTACTTATGTACATTTCTAAGCTATAGCGTCTACCCTTGTCAGACCCAGGACGATGGGTGTCACATCTCCTTTCTAGTCAACCTAAGAGAGGAGGAAATGCCGCGATATATGTTCCACCCTGTCATCACGAAGGCCACCACAATCTATCCCAGAACAGCCGGCACTTCACTCACGATGCCCCGATGCTGGATTCCTATTGTCGCCCTTATTAGGGCAAGCGGTGCCAGTAGCAGAATATGTCACCTCAACAACTCGATCCACCCCTGCCCATTACATGGGTAACATATCCATGGAGGTTCGATGTATACTCGAGGATACAGTCGTCCATCACGCCCGCCTACATACCCATTACATCAGCATAG

ID002 Cutibacterium acnes_A10 GTTGCACACCAGGGGGTCAACTTGGCGTCCTCAGTTCAAAATTGATTCAAACTAACAGTTCCATGCCGGGAAACAGCACCAGGAAGCTCGTGACATATCGTCTTTCATTGCGAGAAACATCTTACTTATGTACATTTCTAAGCTATAGCGTCTACCCTTGTCAGACCCAGGACGATGGGTGTCACATCTCCTTTCTAGTCAACCTAAGAGAGGAGGAAATGCCGCGATATATGTTCCACCCTGTCATCACGAAGGCCACCACAATCTATCCCAGAACAGCCGGCACTTCACTCACGATGCCCCGATGCTGGATTCCTATTGTCGCCCTTATTAGGGCAAGCGGTGCCAGTAGCAGAATATGTCACCTCAACAACTCGATCTACCCCTGCCCATTACATGGGTAACATATCCATGGAGGTTCGATGTATACTCGAGGATACAGTCGTCCATCACGCCCGCCTACATACCCATTACATCAGCATAG

I don't understand what I did wrong. Any suggestions, please? Thank you in advance.

safiqu avatar Apr 29 '22 11:04 safiqu

The way you have written the species names, every entry is being treated as a distinct species. The text parser reads acnes_A1 and acnes_A10 as two different species. You didn't state what problem you are having, but my guess is that this is the root cause.

benjjneb avatar May 02 '22 14:05 benjjneb
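
To illustrate the point in the reply above, here is a minimal Python sketch (not dada2 code) of how a whitespace-based parser would see the two headers from the question. It assumes each header has the form `ID Genus species`, and that the strain tag is a trailing underscore followed by optional letters and digits (`_A1`, `_A10`), as in the examples posted. The helper names `split_header` and `collapse_strain` are hypothetical, and whether collapsing strains into one species label is appropriate depends on what the database is meant to resolve.

```python
import re

# Assumption: a strain tag is a trailing "_" + optional letters + digits,
# e.g. "_A1" or "_A10", as seen in the headers posted in the question.
STRAIN_SUFFIX = re.compile(r"_[A-Za-z]*\d+$")


def split_header(header):
    """Split an 'ID Genus species' header into its three fields."""
    seq_id, genus, species = header.lstrip(">").split(maxsplit=2)
    return seq_id, genus, species


def collapse_strain(species):
    """Drop the trailing strain tag so strains share one species label."""
    return STRAIN_SUFFIX.sub("", species)


headers = [
    "ID001 Cutibacterium acnes_A1",
    "ID002 Cutibacterium acnes_A10",
]

# Species labels exactly as written: each strain looks like its own species.
raw_species = {split_header(h)[2] for h in headers}

# Species labels after stripping the strain tag: both collapse to one name.
collapsed_species = {collapse_strain(split_header(h)[2]) for h in headers}

print(sorted(raw_species))
print(sorted(collapsed_species))
```

With the two headers above, the raw labels form a set of two distinct names, while the collapsed labels reduce to a single name. If strain-level labels are needed, they would have to be kept somewhere other than the species field.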