Zamin Iqbal

Results 180 comments of Zamin Iqbal

Bah. Those files had some things that were non-overlapping, but only if you think, e.g. `chr1 10 A G` and `chr1 10 AG A`. I think those confused it. when...

OK, well, this is a bug I think. Current status, from my point of view: if there are no overlapping variants and only SNPs, then vcfcombine works very well,...

When you say 'simply' - is it a long compute job?

VCF attached [perl_generated_vcf.txt](https://github.com/iqbal-lab-org/gramtools/files/3245304/perl_generated_vcf.txt) some of these records are horrible. eg see 523027, which starts like this NC_000962.3 523027 . GCAACACC ACAAAAA,ACAAAAC,ACAAAACA,ACAAAACC,ACAAAACG,ACAAAACT,ACAAAAG,ACAAAAT,ACAAACCA,ACAAACCC,ACAAACCG,ACAAACCT,ACAACAA,ACAACAC,ACAACACA,ACAACACC,ACAACACG,ACAACACT,ACAACAG,ACAACAT,ACAACCCA,ACAACCCC,ACAACCCG,ACAACCCT,ACAAGAA,ACAAGAC,ACAAGACA,ACAAGACC,ACAAGACG,ACAAGACT,ACAAGAG,ACAAGAT,ACAAGCCA,ACAAGCCC,ACAAGCCG,ACAAGCCT,ACAATAA,ACAATAC,ACAATACA,ACAATACC,ACAATACG,ACAATACT,ACAATAG,ACAATAT,ACAATCCA,ACAATCCC,ACAATCCG,ACAATCCT,ACAGAAA,ACAGAAAACA,ACAGAAAACC,ACAGAAAACG,ACAGAAAACT,ACAGAAACCA,ACAGAAACCC,ACAGAAACCG,ACAGAAACCT,ACAGAAC,ACAGAACA,ACAGAACACA,ACAGAACACC,ACAGAACACG,ACAGAACACT,ACAGAACC,ACAGAACCCA,ACAGAACCCC,ACAGAACCCG,ACAGAACCCT,ACAGAACG,ACAGAACT,ACAGAAG,ACAGAAGACA,ACAGAAGACC,ACAGAAGACG,ACAGAAGACT,ACAGAAGCCA,ACAGAAGCCC,ACAGAAGCCG,ACAGAAGCCT,ACAGAAT,ACAGAATACA,ACAGAATACC,ACAGAATACG,ACAGAATACT,ACAGAATCCA,ACAGAATCCC,ACAGAATCCG,ACAGAATCCT,ACAGACCA,ACAGACCC,ACAGACCG,ACAGACCT,ACAGAGAACA,ACAGAGAACC,ACAGAGAACG,ACAGAGAACT,ACAGAGACCA,ACAGAGACCC,ACAGAGACCG,ACAGAGACCT,ACAGAGCACA,ACAGAGCACC,ACAGAGCACG,ACAGAGCACT,ACAGAGCCCA,ACAGAGCCCC,ACAGAGCCCG,ACAGAGCCCT,ACAGAGGACA,ACAGAGGACC,ACAGAGGACG,ACAGAGGACT,ACAGAGGCCA,ACAGAGGCCC,ACAGAGGCCG,ACAGAGGCCT,ACAGAGTACA,ACAGAGTACC,ACAGAGTACG,ACAGAGTACT,ACAGAGTCCA,ACAGAGTCCC,ACAGAGTCCG,ACAGAGTCCT,ACAGCAA,ACAGCAC,ACAGCACA,ACAGCACC,ACAGCACG,ACAGCACT,ACAGCAG,ACAGCAT,ACAGCCCA,ACAGCCCC,ACAGCCCG,ACAGCCCT,ACAGGAA,ACAGGAC,ACAGGACA,ACAGGACC,ACAGGACG,ACAGGACT,ACAGGAG,ACAGGAT,ACAGGCCA,ACAGGCCC,ACAGGCCG,ACAGGCCT,ACAGTAA,ACAGTAC,ACAGTACA,ACAGTACC,ACAGTACG,ACAGTACT,ACAGTAG,ACAGTAT,ACAGTCCA,ACAGTCCC,ACAGTCCG,ACAGTCCT,AGAAAAA,AGAAAAC,AGAAAACA,AGAAAACC,AGAAAACG,AGAAAACT,AGAAAAG,AGAAAAT,AGAAACCA,AGAAACCC,AGAAACCG,AGAAACCT,AGAACAA,AGAACAC,AGAACACA,AGAACACC,AGAACACG,AGAACACT,AGAACAG,AGAACAT,AGAACCCA,AGAACCCC,AGAACCCG,AGAACCCT,AGAAGAA,AGAAGAC,AGAAGACA,AGAAGACC
,AGAAGACG,AGAAGACT,AGAAGAG,AGAAGAT,AGAAGCCA,AGAAGCCC,AGAAGCCG,AGAAGCCT,AGAATAA,AGAATAC,AGAATACA,AGAATACC,AGAATACG,AGAATACT,AGAATAG,AGAATAT,AGAATCCA,AGAATCCC,AGAATCCG,AGAATCCT,AGAGAAA,AGAGAAAACA,AGAGAAAACC,AGAGAAAACG,AGAGAAAACT,AGAGAAACCA,AGAGAAACCC,AGAGAAACCG,AGAGAAACCT,AGAGAAC,AGAGAACA,AGAGAACACA,AGAGAACACC,AGAGAACACG,AGAGAACACT,AGAGAACC,AGAGAACCCA,AGAGAACCCC,AGAGAACCCG,AGAGAACCCT,AGAGAACG,AGAGAACT,AGAGAAG,AGAGAAGACA,AGAGAAGACC,AGAGAAGACG,AGAGAAGACT,AGAGAAGCCA,AGAGAAGCCC,AGAGAAGCCG,AGAGAAGCCT,AGAGAAT,AGAGAATACA,AGAGAATACC,AGAGAATACG,AGAGAATACT,AGAGAATCCA,AGAGAATCCC,AGAGAATCCG,AGAGAATCCT,AGAGACCA,AGAGACCC,AGAGACCG,AGAGACCT,AGAGAGAACA,AGAGAGAACC,AGAGAGAACG,AGAGAGAACT,AGAGAGACCA,AGAGAGACCC,AGAGAGACCG,AGAGAGACCT,AGAGAGCACA,AGAGAGCACC,AGAGAGCACG,AGAGAGCACT,AGAGAGCCCA,AGAGAGCCCC,AGAGAGCCCG,AGAGAGCCCT,AGAGAGGACA,AGAGAGGACC,AGAGAGGACG,AGAGAGGACT,AGAGAGGCCA,AGAGAGGCCC,A

**Proposal 1**

1. Divide the PRG into chunks of length 10000bp (or whatever). Say this is N chunks.
2. For each chunk calculate the list of kmers contained within (not...
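The two visible steps of Proposal 1 could be sketched roughly as follows. This is only an illustration under loud assumptions: the PRG is treated as a plain linear string (a real PRG encodes variant sites, which this ignores), the function names `chunk_sequence`, `kmers_in` and `kmers_per_chunk` are hypothetical, the k-mer size is an arbitrary choice, and each chunk's kmers are stored as a set rather than a list. The rest of the proposal is cut off above, so nothing beyond steps 1 and 2 is attempted.

```python
def chunk_sequence(seq: str, chunk_len: int = 10_000) -> list[str]:
    """Step 1: split seq into consecutive chunks of at most chunk_len characters."""
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    return [seq[i:i + chunk_len] for i in range(0, len(seq), chunk_len)]


def kmers_in(chunk: str, k: int) -> set[str]:
    """Step 2: collect the distinct kmers of length k found inside one chunk."""
    if k <= 0:
        raise ValueError("k must be positive")
    # A chunk shorter than k yields an empty range, hence an empty set.
    return {chunk[i:i + k] for i in range(len(chunk) - k + 1)}


def kmers_per_chunk(seq: str, chunk_len: int = 10_000, k: int = 31) -> list[set[str]]:
    """Return one kmer set per chunk, in chunk order (N sets for N chunks)."""
    # Assumption: kmers that span a chunk boundary are not counted.
    return [kmers_in(chunk, k) for chunk in chunk_sequence(seq, chunk_len)]
```

For example, `kmers_per_chunk("ACGTACGTAC", chunk_len=4, k=3)` gives three sets, the last of which is empty because the final chunk `AC` is shorter than k.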

**Proposal 2** As proposal 1, except when comparing chunk i and chunk j, (assume diploid, but easy to modify what I say for other ploidies) sample 100 (or some...

Impact on *Plasmodium falciparum* (key use case for us): will immediately remove the crazy repeat regions where we should not waste time trying to quasimap, or variant call. Less wasted...

BTW, above I said something like `Impact on Plasmodium falciparum (key use case for us) will immediately remove the crazy repeat regions`. I have since got a workaround for the...