Peter Krusche

35 comments by Peter Krusche

Hi, I think hap.py should also work with other datasets. At a minimum, you need these files:

* a truth VCF file
* a query VCF file
* a reference FASTA...
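As a rough illustration of how those three inputs fit together, here is a minimal sketch that only assembles a hap.py command line without running it. The helper name `build_happy_command` and the file names are hypothetical; `-r` (reference) and `-o` (output prefix) are the options I would expect to need, but check `hap.py --help` for your version.

```python
def build_happy_command(truth_vcf, query_vcf, reference_fasta, output_prefix):
    """Assemble (but do not run) a basic hap.py invocation.

    Hypothetical helper: it only builds the argument list, so it can be
    inspected or passed to subprocess.run() once the paths are known to exist.
    """
    return [
        "hap.py",
        truth_vcf,               # truth VCF file
        query_vcf,               # query VCF file
        "-r", reference_fasta,   # reference FASTA
        "-o", output_prefix,     # prefix for the output files
    ]


if __name__ == "__main__":
    # Example paths are placeholders, not files shipped with hap.py.
    print(" ".join(build_happy_command(
        "truth.vcf.gz", "query.vcf.gz", "reference.fa", "results/run1")))
```

Keeping the command as a list avoids shell-quoting problems if any path contains spaces.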

Here are my slides from the last benchmarking call related to this issue: https://docs.google.com/presentation/d/1VCguvdhaSJI0z7Vbn_oyBYdoYsMzqMyjlTIroHoLBks/edit?usp=sharing Also, the proposed output format in there is now supported by hap.py 0.3.0 and is documented...

Also, here are some comments w.r.t. the differences between hap.py and the metrics definitions document: - hap.py outputs _TRUTH_.TP and _QUERY_.TP. Since (VCF-based) counts can be different depending on the...

@bioinformed : about hap.py / xcmp: they will implement the new intermediate format soon, probably in February (it started out similar to what hap.py is writing, but changed during the...

In the matching case, if the comparison tool chooses to not split any input variants, I guess the only way to output the result is to print the records as...

Can you paste the column header line of that VCF file? Maybe there is a spurious tab / space at the end of the `#CHROM ...` line?
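A quick way to check for this without eyeballing the file is a small script along these lines. It is only a sketch: the function name `header_whitespace_problems` is made up for illustration, and it looks at nothing but the `#CHROM` line.

```python
def header_whitespace_problems(vcf_lines):
    """Return a list of whitespace problems found on the #CHROM header line."""
    problems = []
    for line in vcf_lines:
        if not line.startswith("#CHROM"):
            continue
        text = line.rstrip("\r\n")  # drop the line ending only
        if text != text.rstrip():
            # A tab or space is left over after the last column name.
            problems.append("trailing whitespace")
        if "" in text.rstrip().split("\t"):
            # Two tabs in a row yield an empty column name.
            problems.append("empty column name")
        break  # there is only one column header line
    return problems


if __name__ == "__main__":
    header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNA12878\t\n"
    print(header_whitespace_problems(["##fileformat=VCFv4.2\n", header]))
```

For a real file, pass an open text handle (for example from `gzip.open(path, "rt")`) instead of the list.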

I suspect that decomposition somehow doesn’t get the AD values quite right. I think the easiest way to fix this would be to drop the AD fields in advance using...
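To make concrete what dropping AD means at the record level, here is a minimal sketch that removes one FORMAT key and the matching value from every sample column of a single VCF data line. The helper `drop_format_field` is hypothetical and purely illustrative; it does not touch the `##FORMAT` header lines, and an established VCF tool is the better choice for real files.

```python
def drop_format_field(record_line, field="AD"):
    """Remove one FORMAT key (default AD) from a tab-separated VCF data line."""
    cols = record_line.rstrip("\n").split("\t")
    if len(cols) < 10:
        # No FORMAT/sample columns present, nothing to drop.
        return "\t".join(cols)
    keys = cols[8].split(":")
    if field not in keys:
        return "\t".join(cols)
    idx = keys.index(field)
    cols[8] = ":".join(k for i, k in enumerate(keys) if i != idx)
    for s in range(9, len(cols)):
        values = cols[s].split(":")
        # VCF allows trailing sample fields to be omitted, so guard the index.
        if idx < len(values):
            del values[idx]
        cols[s] = ":".join(values)
    return "\t".join(cols)


if __name__ == "__main__":
    line = "chr1\t100\t.\tA\tC,T\t50\tPASS\t.\tGT:AD:DP\t1/2:0,10,12:22"
    print(drop_format_field(line))
```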

hap.py lets you do this via the `--set-gt` option.

```
--set-gt {half,hemi,het,hom}
    This is used to treat Strelka somatic files
    Possible values for this parameter: half / hemi / het...
```

Another way to get an idea of the allele-only performance is to run in standard mode and use the FP.GT value. FP.GT gives the number of query calls which the...

For a 1/2 call, `--set-gt hom` would produce two 1/1 records, one for each allele. Of course these cannot be haplotype matched anymore if they overlap on the reference after...
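The 1/2 case can be pictured with a toy model like the one below. This is not hap.py's implementation, just a sketch of the described outcome: the function `split_to_hom` is hypothetical, and its handling of genotypes other than 1/2 is an assumption made for illustration.

```python
def split_to_hom(ref, alts, gt):
    """Model the described effect of `--set-gt hom` on one call.

    Returns one (ref, alt, "1/1") tuple per distinct non-reference allele
    named in the genotype, in order of first appearance.
    """
    allele_indices = []
    for allele in gt.replace("|", "/").split("/"):
        if allele in (".", "0"):
            continue  # skip missing and reference alleles
        index = int(allele)
        if index not in allele_indices:
            allele_indices.append(index)
    # ALT alleles are numbered from 1 in the genotype field.
    return [(ref, alts[i - 1], "1/1") for i in allele_indices]


if __name__ == "__main__":
    # A 1/2 call at a site with REF=A and ALT=C,T becomes two 1/1 records.
    for record in split_to_hom("A", ["C", "T"], "1/2"):
        print(record)
```

Both output records sit at the same position, which is where the overlap problem mentioned above comes from.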