Rachel Colquhoun comments


Error rate vs. genotyping error rate

Interesting. I think this must be a bug. I made the commit here https://github.com/rmcolq/pandora/commit/062e4ebe84424619241c59a14582536000422700#diff-23632e736a3562808e4b2d5bf5dc59c7 to add the `genotyping_error_rate` parameter. It was just a constant number before this. However I did...

Max coverage and unmapped reads

Interesting thoughts here! 1. Are short reads causing an issue at the moment? For Illumina reads, they should all be the same length, so there is no problem. For Nanopore,...

Max coverage and unmapped reads

The original purpose of this was twofold: the primary purpose was to control RAM use by placing a cap, and the secondary purpose was to downsample so the same threshold for...

genotype confidence is covg dependent - need normalisation of confidence

It makes absolute sense to normalise confidence scores so that we can set an absolute threshold for confidences and pretty much know our FP/FN rate at that level. I don't...

Estimate error rate

Some estimation is already done: in estimate_parameters.cpp, if the coverage is sufficiently high, we fit a curve to the kmer coverage distribution and work from that.
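
As a rough illustration of fitting a curve to a kmer coverage distribution, here is a minimal sketch. It is not the code in estimate_parameters.cpp: the choice of a negative binomial, the method-of-moments fit, the function name `fit_negative_binomial` and the `min_mean_coverage` threshold are all assumptions made for the example.

```python
def fit_negative_binomial(kmer_coverages, min_mean_coverage=0.0):
    """Method-of-moments fit of a negative binomial to kmer coverages.

    Hypothetical helper, not part of pandora. Uses the parametrisation
    mean = r * (1 - p) / p and variance = r * (1 - p) / p**2.
    Returns (r, p), or None when a fit is not possible.
    """
    n = len(kmer_coverages)
    if n < 2:
        return None

    mean = sum(kmer_coverages) / n
    # Unbiased sample variance.
    var = sum((c - mean) ** 2 for c in kmer_coverages) / (n - 1)

    # Only fit when coverage is high enough (threshold is an assumption)
    # and the data are over-dispersed, otherwise r and p are undefined.
    if mean <= min_mean_coverage or var <= mean:
        return None

    p = mean / var
    r = mean * mean / (var - mean)
    return r, p


params = fit_negative_binomial([2, 4, 4, 6, 9, 11])
```

When the fit returns None, a caller would need some fallback, for example a fixed default error rate.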

Estimate error rate

This still needs a fix for a complication in the compare situation: the read datasets may not all have the same error rate.

Slow KmerGraph test

It takes ages because it's based on real cases where there were bugs at some point. At the time I didn't know what the bugs were. The shorter one would...

Slow KmerGraph test

From memory, it was mostly bugs elsewhere that were not caught by other tests at the time. The length was not important, but the problem was with pruning edges which are surplus (arise...

How does sourmash gather time scale with reads (and can this be reduced with multithreading)?

Hi @ctb, thank you for this in-depth answer! It sounds like fastmultigather gets a long way towards what we are looking for, so I'll definitely check it out. I...

How does sourmash gather time scale with reads (and can this be reduced with multithreading)?

Amazing, thanks for keeping this updated with the developments!