
add SILVA/greengenes DB to 16S amptk taxonomy

Open nextgenusfs opened this issue 5 years ago • 11 comments

The problem is reformatting the taxonomy information into the format needed by UTAX/SINTAX. I've previously looked at the SILVA taxonomy, and it appeared to be a hot mess (I'm not a bacteriologist). So the challenge will be converting the taxonomy strings to the proper format.

Example taxonomy strings in the required format:

>BOLD:ACI6695;tax=k:Animalia,p:Arthropoda,c:Insecta,o:Coleoptera,f:Elateridae,g:Nipponoelater,s:Nipponoelater babai
>S004604051;tax=k:Fungi,p:Basidiomycota,c:Agaricomycetes,o:Hymenochaetales,f:Hymenochaetaceae,g:Inonotus,s:Sanghuangporus zonatus
>S004127186;tax=k:Fungi,p:Ascomycota
>S004061552;tax=k:Fungi,p:Ascomycota,c:Eurotiomycetes,s:Pyrenula sanguinea
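As a rough illustration of the conversion, here is a minimal Python sketch that builds a header in the style shown above from a semicolon-delimited lineage. It assumes the input lineage lists names in fixed rank order (kingdom through species), with an empty field where a rank is missing; real SILVA or Greengenes lineages may not follow that layout, so the positional mapping is an assumption rather than a description of either database. The function name `to_utax_header` and the `RANK_PREFIXES` list are hypothetical and are not part of amptk.

```python
# Rank prefixes in the order used by the example headers above.
RANK_PREFIXES = ["k", "p", "c", "o", "f", "g", "s"]


def to_utax_header(seq_id, lineage, sep=";"):
    """Build a '>ID;tax=k:...,p:...' header from a delimited lineage.

    Assumption: the lineage lists names in rank order, with an empty
    field for any rank that is missing.
    """
    names = [name.strip() for name in lineage.split(sep)]
    fields = []
    for prefix, name in zip(RANK_PREFIXES, names):
        if not name:
            # Skip missing ranks, as in the header that jumps from c: to s:.
            continue
        # Commas separate ranks in the output, so they cannot appear in a name.
        name = name.replace(",", "_")
        fields.append(f"{prefix}:{name}")
    return f">{seq_id};tax=" + ",".join(fields)


print(to_utax_header("S004127186", "Fungi;Ascomycota"))
print(to_utax_header("S004061552", "Fungi;Ascomycota;Eurotiomycetes;;;;Pyrenula sanguinea"))
```

Lineages with a variable number of levels would need an explicit rank lookup instead of this positional mapping.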

nextgenusfs · Oct 03 '18 17:10