SETH
Incorrectly extracted mutations
Some patterns return only a partial match when the text contains a longer mutation mention. These patterns need to be extended, or new, longer patterns need to be added that take precedence over the shorter matches.
Examples:
- PMID=20806047 occurrence=p.X320SerextX72 extracted=p.X320Ser
- PMID=23903049 occurrence=p.His33GInfsX32 extracted=p.His33G
- PMID=22907560 occurrence=p.Arg313Hys extracted=p.Arg313H
- PMID=18486607 occurrence=p.Arg315Stop extracted=p.Arg315S
- PMID=23017188 occurrence=p.Phe508Del extracted=p.Phe508D
- PMID=24158885 occurrence=p.Met694IIe extracted=p.Met694I
- PMID=23856132 occurrence=p.F55>Lfs extracted=p.F55>L
- PMID=18708425 occurrence=p.L15_L16ins2L extracted=p.L15_L16ins2
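To illustrate the precedence problem, here is a minimal sketch in Python. The patterns are hypothetical, heavily simplified stand-ins and not SETH's actual grammar or regular expressions; `first_alternative` and `longest_match` are names made up for this example. It shows how a short pattern listed first can shadow a longer one, and how keeping the longest candidate avoids the truncation for three of the cases above.

```python
import re

# Hypothetical, simplified patterns for illustration only (not SETH's grammar).
AA3 = r"[A-Z][a-z]{2}"                    # three-letter residue code, e.g. Arg
SHORT = rf"p\.{AA3}\d+[A-Z]"              # one-letter target, e.g. p.Arg313H
LONG_SUB = rf"p\.{AA3}\d+{AA3}"           # three-letter target, e.g. p.Arg313Hys
LONG_DEL = rf"p\.{AA3}\d+[Dd]el"          # deletion, e.g. p.Phe508Del
LONG_STOP = rf"p\.{AA3}\d+(?:Stop|Ter)"   # stop codon, e.g. p.Arg315Stop

# The short pattern comes first, which is what causes the partial matches.
PATTERNS = [SHORT, LONG_SUB, LONG_DEL, LONG_STOP]


def first_alternative(text):
    """Ordered alternation: the first alternative that matches wins."""
    combined = "|".join(f"(?:{p})" for p in PATTERNS)
    m = re.search(combined, text)
    return m.group() if m else None


def longest_match(text):
    """Try every pattern and keep the longest matched string.

    Assumes the text contains a single mutation mention, so all
    patterns match at the same start position.
    """
    candidates = []
    for pattern in PATTERNS:
        m = re.search(pattern, text)
        if m:
            candidates.append(m.group())
    return max(candidates, key=len, default=None)


for mention in ["p.Arg313Hys", "p.Arg315Stop", "p.Phe508Del"]:
    print(mention, first_alternative(mention), longest_match(mention))
```

With the short pattern listed first, `first_alternative` reproduces the truncated extractions reported above (for example `p.Arg313H`), while `longest_match` returns the full mention. Reordering the alternatives so that longer patterns come first would have a similar effect, but only if every longer pattern is ordered correctly relative to every shorter one.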
Thanks for the report. I added test cases for the described errors here: https://github.com/rockt/SETH/blob/master/src/test/java/de/hu/berlin/wbi/issues/Request10Test.java
Some errors (3, 4, 5, 6) should be easy to fix; it seems that the parser stops too early in these cases. Other errors (e.g., 1, 7, 8) probably need a major adaptation of the implemented Backus-Naur grammar.
Cool, thanks. I will look into this as well at some point.