srst2
Different MLST result when re-running SRST2 after adding in new alleles
Howdy all,
I'm running into an odd issue. I am running a few K. pneumoniae samples against an MLST database using SRST2 0.2.0, and on the first run it generated the following best match:
It says the rpoB_135 allele hit has 1 SNP, and these are the coverage stats:
So I saved my new consensus fastas, and then added them back to my MLST database to allow for calling against these "new" alleles. When rerunning the same FASTQ file, I then got this result:
258 3 3 1 1 1 1 79
With these stats:
42.48 0.171428571429
Any ideas? The alleles for ST258 should have been present in the earlier database, so I'm not sure why I would get an rpoB 135* call on the first run, with lower overall depth of coverage (~22x), versus 42x on the repeat, which gave a clean hit against all ST258 alleles.
Best, S. Wesley Long
Hmmm, that's an interesting one. Our first hypothesis is that your reads are a mix of different genomes, as this has been the cause of weird SRST2 results in the past. So perhaps run some QC to see if you have a mixed sample?
It would be really informative if you could run SRST2 with --save_scores for your two databases (before and after the new allele was added). Then we could take a look at the .scores files. In particular, I'm curious how allele 1 scored in your first run and how allele 135 scored in your second run.
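If it helps, here is a rough Python sketch for putting the two .scores files side by side for the alleles in question. It assumes the .scores file is tab-delimited with a header row containing an Allele column plus columns such as Score, Avg_depth and Mismatches; those column names are an assumption, so check them against the header of your own files. The function names are hypothetical helpers, not part of SRST2.

```python
import csv
import io


def load_scores(text):
    """Parse the text of a .scores file into {allele_name: row_dict}.

    Assumes a tab-delimited table whose header has an 'Allele' column
    (an assumption; adjust to match the real header).
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["Allele"]: row for row in reader}


def compare_alleles(before, after, alleles,
                    columns=("Score", "Avg_depth", "Mismatches")):
    """Return (allele, column, value_before, value_after) tuples.

    A value is None when the allele or column is absent from that run,
    e.g. an allele that was not scored against one of the databases.
    """
    rows = []
    for allele in alleles:
        for col in columns:
            value_before = before.get(allele, {}).get(col)
            value_after = after.get(allele, {}).get(col)
            rows.append((allele, col, value_before, value_after))
    return rows
```

To use it, read each file's contents into load_scores (one call per run) and pass both results to compare_alleles along with the allele names as they appear in your database, for example rpoB_1 and rpoB_135.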
Mixed sample is most likely. These are all samples from clinical specimens, so not unusual to have a "community" of organisms.
Fairly busy at the moment but I will try to get the scores run on this particular example and let you know what they say.
(Accidental close, apologies)