Selection logic for gene_id in amg_summary.tsv

Open diego00012138 opened this issue 8 months ago • 1 comments

Hi, the annotation table gives comparison results against several databases for each gene, but only some of those results are retained in amg_summary.tsv. For example, when a gene hits both KEGG and Pfam, only the Pfam ID may be retained in the AMG summary, and in the case below only two of the three Pfam IDs are kept. What filtering strategy produces this?

annotation.tsv:

k141_110713__full-cat_1_10 final-viral-combined-for-dramv k141_110713__full-cat_1 10 11206 12321 -1 C K01711 GDPmannose 4,6-dehydratase [EC:4.2.1.47] YP_009323158.1 YP_009323158.1 nucleotide-sugar epimerase [Synechococcus phage S-CAM7] False 0.509 323.0 6.7e-97 GDP-mannose 4,6 dehydratase [PF16363.10]; NAD dependent epimerase/dehydratase family [PF01370.26]; RmlD substrate binding domain [PF04321.22] VOG23305 sp|Q9EQC1|3BHS7_MOUSE 3 beta-hydroxysteroid dehydrogenase type 7; Xh Xh 0 1 2 False MK

amg_summary.tsv:

k141_110713__full-cat_1_10 PF04321 k141_110713__full-cat_1 2 MK RmlD substrate binding domain amg_database Roux et al. 2016 False
k141_110713__full-cat_1_10 PF01370 k141_110713__full-cat_1 2 MK NAD dependent epimerase/dehydratase family amg_database Roux et al. 2016 False
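
To make the discrepancy explicit, here is a minimal sketch that lists which IDs from the annotation row do not appear in the summary rows. The values are copied by hand from the two excerpts above, and the helper `pfam_ids` is hypothetical, not part of DRAM. It only compares the IDs and says nothing about how DRAM actually selects them.

```python
import re

# Pfam hits string copied from the annotation row above.
pfam_hits = (
    "GDP-mannose 4,6 dehydratase [PF16363.10]; "
    "NAD dependent epimerase/dehydratase family [PF01370.26]; "
    "RmlD substrate binding domain [PF04321.22]"
)
kegg_id = "K01711"  # KEGG ID from the same annotation row

# gene_id values that appear for this gene in the summary excerpt above.
summary_ids = {"PF04321", "PF01370"}


def pfam_ids(hits):
    """Hypothetical helper: extract Pfam accessions, dropping the version suffix."""
    return re.findall(r"\[(PF\d+)\.\d+\]", hits)


annotation_ids = [kegg_id] + pfam_ids(pfam_hits)
missing = [i for i in annotation_ids if i not in summary_ids]
print(missing)  # ['K01711', 'PF16363']
```

So for this gene, K01711 and PF16363 are present in the annotation but absent from the summary.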

Sincerely

diego00012138, Jun 17 '24 13:06