DRAM
Selection logic for gene_id in amg_summary.tsv
Hi, the annotation table gives comparison results against several databases for each gene, but only some of those results are retained in amg_summary.tsv. For example, when a gene hits both KEGG and Pfam, only the Pfam IDs may be retained in amg_summary.tsv, and in the case below only two of the three Pfam IDs are kept. What filtering strategy produces this?
annotation.tsv:
k141_110713__full-cat_1_10 final-viral-combined-for-dramv k141_110713__full-cat_1 10 11206 12321 -1 C K01711 GDPmannose 4,6-dehydratase [EC:4.2.1.47] YP_009323158.1 YP_009323158.1 nucleotide-sugar epimerase [Synechococcus phage S-CAM7] False 0.509 323.0 6.7e-97 GDP-mannose 4,6 dehydratase [PF16363.10]; NAD dependent epimerase/dehydratase family [PF01370.26]; RmlD substrate binding domain [PF04321.22] VOG23305 sp|Q9EQC1|3BHS7_MOUSE 3 beta-hydroxysteroid dehydrogenase type 7; Xh Xh 0 1 2 False MK
amg_summary.tsv:
k141_110713__full-cat_1_10 PF04321 k141_110713__full-cat_1 2 MK RmlD substrate binding domain amg_database Roux et al. 2016 False
k141_110713__full-cat_1_10 PF01370 k141_110713__full-cat_1 2 MK NAD dependent epimerase/dehydratase family amg_database Roux et al. 2016 False
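To make the discrepancy concrete, here is a minimal Python sketch that lists which IDs from the annotation row are missing from the summary rows. It is not part of DRAM and does not reproduce its filtering. The helper names (`extract_ids`, `dropped_ids`) are hypothetical, and the regular expressions are only my assumption about the ID formats, based on the rows shown above.

```python
import re

# Assumed ID formats, inferred only from the rows shown above.
ID_PATTERNS = [
    re.compile(r"\bK\d{5}\b"),   # KEGG orthology, e.g. K01711
    re.compile(r"\bPF\d{5}\b"),  # Pfam accession without version, e.g. PF16363
    re.compile(r"\bVOG\d+\b"),   # VOGDB, e.g. VOG23305
]


def extract_ids(annotation_text):
    """Collect every KEGG/Pfam/VOG-looking ID found in one annotation row."""
    ids = set()
    for pattern in ID_PATTERNS:
        ids.update(pattern.findall(annotation_text))
    return ids


def dropped_ids(annotation_text, summary_ids):
    """Return the IDs present in the annotation row but absent from the summary."""
    return sorted(extract_ids(annotation_text) - set(summary_ids))


# The annotation row from this post, pasted as plain text.
annotation_row = (
    "k141_110713__full-cat_1_10 final-viral-combined-for-dramv "
    "k141_110713__full-cat_1 10 11206 12321 -1 C K01711 "
    "GDPmannose 4,6-dehydratase [EC:4.2.1.47] YP_009323158.1 YP_009323158.1 "
    "nucleotide-sugar epimerase [Synechococcus phage S-CAM7] False 0.509 323.0 "
    "6.7e-97 GDP-mannose 4,6 dehydratase [PF16363.10]; "
    "NAD dependent epimerase/dehydratase family [PF01370.26]; "
    "RmlD substrate binding domain [PF04321.22] VOG23305 "
    "sp|Q9EQC1|3BHS7_MOUSE 3 beta-hydroxysteroid dehydrogenase type 7; "
    "Xh Xh 0 1 2 False MK"
)

# The two IDs that appear in the summary rows above.
summary_ids = ["PF04321", "PF01370"]

print(dropped_ids(annotation_row, summary_ids))
```

For this gene the sketch reports K01711, PF16363 and VOG23305 as dropped, which are the IDs I am asking about.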
Sincerely