
Missing mutations from WHO Catalogue

Open mlarjim opened this issue 1 year ago • 10 comments

Hi! As far as I understand, the tb-profiler database contains all the drug-resistance mutations listed in the WHO catalogue. However, the following mutation is not in tbdb: https://github.com/jodyphelan/tbdb/blob/master/tbdb.csv

Gene: gid
Mutation: gid_347_ins_1_cgcacgatctcaacggcc_ccgcacgatctcaacggca
Literature Evidence: Catalogue of mutations in Mycobacterium tuberculosis complex and their association with drug resistance (WHO)
conf_grade: 2) Assoc w R - Interim (STM_S)

Why is this variant missing?

mlarjim avatar Jul 13 '23 10:07 mlarjim

Some variants could not be translated because the reference and alternate alleles did not agree with the rest of the variant description. In this case the name says it should be an insertion of 1 nucleotide, but if we align the reference and the alternate we see it is actually a combination of an insertion and a SNP:

c-gcacgatctcaacggcc
|*||||||||||||||||* 
ccgcacgatctcaacggca

There were a few of these cases
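
A minimal sketch of how such cases could be flagged automatically, assuming the reference and alternate alleles are available as plain strings. The function names `trim_shared_ends` and `classify` are hypothetical and are not part of tb-profiler. The idea is that a pure insertion leaves nothing of the reference once the shared leading and trailing bases are stripped; anything else is a more complex change.

```python
def trim_shared_ends(ref, alt):
    """Strip the bases shared at the start and then at the end of both alleles."""
    # Shared prefix
    i = 0
    while i < min(len(ref), len(alt)) and ref[i] == alt[i]:
        i += 1
    ref, alt = ref[i:], alt[i:]
    # Shared suffix of what is left
    j = 0
    while j < min(len(ref), len(alt)) and ref[-1 - j] == alt[-1 - j]:
        j += 1
    return ref[:len(ref) - j], alt[:len(alt) - j]


def classify(ref, alt):
    """Rough variant class based on what remains after trimming."""
    r, a = trim_shared_ends(ref.lower(), alt.lower())
    if not r and not a:
        return "identical"
    if not r:
        return "insertion"
    if not a:
        return "deletion"
    if len(r) == 1 and len(a) == 1:
        return "snp"
    return "complex"


# Alleles taken from the catalogue entry quoted in this thread
gid_ref = "cgcacgatctcaacggcc"
gid_alt = "ccgcacgatctcaacggca"
print(classify(gid_ref, gid_alt))  # not a pure insertion, despite the "ins_1" in the name
```

With this check, an entry whose name says `ins` but whose alleles classify as `complex` would be flagged for manual review rather than translated.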

jodyphelan avatar Jul 13 '23 10:07 jodyphelan

Thank you for your remark, Jody. Indeed, the WHO catalogue is mistaken in the variant nomenclature. But the final annotation (column final_annotation.TentativeHGVSNucleotidicAnnotation) states that the mutation is actually a combination of a deletion and an insertion:

c.330_346delGGCCGTTGAGATCGTGCinsTGCCGTTGAGATCGTGCG

Is there any possibility that the tb-profiler database could cover these cases?
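
For illustration, a short sketch that checks whether the catalogue alleles and the HGVS annotation describe the same change. It assumes gid lies on the reverse strand (its locus tag, Rv3919c, suggests so), so the alleles are reverse-complemented before comparison. The helper names are hypothetical, and the regular expression only handles this simple `del...ins...` form.

```python
import os
import re

COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def trim_shared_ends(ref, alt):
    """Strip bases shared at the start and then at the end of both alleles."""
    p = len(os.path.commonprefix([ref, alt]))
    ref, alt = ref[p:], alt[p:]
    s = len(os.path.commonprefix([ref[::-1], alt[::-1]]))
    return ref[:len(ref) - s], alt[:len(alt) - s]


# Values quoted in this thread
catalogue_ref = "cgcacgatctcaacggcc"
catalogue_alt = "ccgcacgatctcaacggca"
hgvs = "c.330_346delGGCCGTTGAGATCGTGCinsTGCCGTTGAGATCGTGCG"

# Parse start, end, deleted bases and inserted bases from the HGVS string
m = re.fullmatch(r"c\.(\d+)_(\d+)del([ACGT]+)ins([ACGT]+)", hgvs)
start, end, deleted, inserted = int(m[1]), int(m[2]), m[3], m[4]

ref_core, alt_core = trim_shared_ends(catalogue_ref, catalogue_alt)

# The deleted span should be as long as the coordinate range says
span_ok = (end - start + 1) == len(deleted)
# On the gene strand, the trimmed alleles should equal the del/ins sequences
alleles_ok = (revcomp(ref_core).upper() == deleted
              and revcomp(alt_core).upper() == inserted)

print(span_ok, alleles_ok)
```

If both checks hold, the HGVS annotation can be used in place of the inconsistent variant name.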

mlarjim avatar Jul 13 '23 10:07 mlarjim

Oh right - I hadn't seen that they had added this hgvs notation now. I will take a look and see if I can include more of these cases.

jodyphelan avatar Jul 13 '23 11:07 jodyphelan

thank you so much!

mlarjim avatar Jul 13 '23 11:07 mlarjim

Hello,

I noticed that the mutation fabG1 c.-16A>G is only listed in TBDB as conferring R-interim for INH, while in the WHO catalogue it has the same prediction for ETH as well. Is there a reason why the ETH prediction was not included?

Thank you! Varvara

frogtraveler avatar Sep 24 '23 00:09 frogtraveler

Another issue: the rrl mutation that TBProfiler reports as n.-255C>T doesn't match anything in TBDB, although it is present in the WHO catalogue with "Uncertain" confidence. Instead, TBDB lists the mutation as c.-255C>T (also uncertain significance). Those are the same, right?

frogtraveler avatar Sep 24 '23 00:09 frogtraveler

Hi @frogtraveler ,

Indeed it looks like

  1. fabG1 c.-16A>G is missing for ETH
  2. rrl c.-255C>T should be listed as n.-255C>T

I'll get a new version of the db released this week and look for any other potential issues.
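
As a rough sketch of how other prefix mismatches like the rrl one could be spotted: compare called variants against database entries while ignoring the leading `c.` or `n.`, and report pairs that differ only in that prefix. The function names and the `(gene, change)` tuple layout are assumptions made for this example; this is not how tb-profiler matches variants internally.

```python
import re

HGVS_PREFIX = re.compile(r"^[cn]\.")


def strip_prefix(change):
    """Drop a leading 'c.' or 'n.' from an HGVS-style change string."""
    return HGVS_PREFIX.sub("", change)


def find_prefix_mismatches(called, database):
    """Return (gene, called_change, db_change) triples that differ only in prefix.

    Both arguments are iterables of (gene, change) tuples.
    """
    by_core = {}
    for gene, change in database:
        by_core.setdefault((gene, strip_prefix(change)), []).append(change)

    mismatches = []
    for gene, change in called:
        for db_change in by_core.get((gene, strip_prefix(change)), []):
            if db_change != change:
                mismatches.append((gene, change, db_change))
    return mismatches


# The two variants discussed in this thread
called = [("rrl", "n.-255C>T"), ("fabG1", "c.-16A>G")]
database = [("rrl", "c.-255C>T"), ("fabG1", "c.-16A>G")]
print(find_prefix_mismatches(called, database))
```

Only the rrl entry is reported here, since fabG1 c.-16A>G matches exactly.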

jodyphelan avatar Sep 25 '23 05:09 jodyphelan

Awesome! Thank you so much, Jody!

frogtraveler avatar Sep 26 '23 05:09 frogtraveler

Hi @frogtraveler,

I've regenerated the mutation lists based on the hgvs annotations from the WHO list now. The mutations you highlighted are now in:

  • fabG1 c.-16A>G: https://github.com/jodyphelan/tbdb/blob/1656b7fe64c5543d55cf52b55de67b2344e9564f/tbdb.csv#L171
  • rrl c.-255C>T: https://github.com/jodyphelan/tbdb/blob/1656b7fe64c5543d55cf52b55de67b2344e9564f/tbdb.other_annotations.csv#L5963

If you run tb-profiler update_tbdb they should be updated for you :)

jodyphelan avatar Oct 13 '23 10:10 jodyphelan

Thank you so much, Jody!

frogtraveler avatar Oct 18 '23 06:10 frogtraveler