Rachel Colquhoun
When running the indep pipeline to create a massive VCF of variants in a group of samples, I get errors from the bcf merge step. Here are some examples of pairs of...
So I think the problem is that the list_all_raw_vcfs file ends up empty. Because it exists, no error is thrown by combine_vcfs.pl and instead it hangs. I believe that the...
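A minimal sketch of the kind of guard that would turn this hang into an immediate error. It is written in Python for illustration only and is not part of combine_vcfs.pl; the function name `read_vcf_list` is hypothetical, and it assumes the list file holds one VCF path per line.

```python
import os


def read_vcf_list(path):
    """Return the non-blank lines of a VCF list file.

    Hypothetical helper: fails loudly if the file is missing or empty
    instead of letting a downstream step wait on nothing.
    """
    if not os.path.isfile(path):
        raise FileNotFoundError(f"VCF list file not found: {path}")
    with open(path) as handle:
        # Assumption: one VCF path per line; blank lines are ignored.
        entries = [line.strip() for line in handle if line.strip()]
    if not entries:
        raise ValueError(f"VCF list file exists but is empty: {path}")
    return entries
```

Checking for content rather than just existence is the point: a file that exists but is empty passes an existence test and fails only later, in a less obvious place.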
I want to do a Smith-Waterman (SW) alignment and get both a score and a CIGAR string as output. I have tried the following two methods using the latest pip-installed version...
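For reference, here is a small pure-Python sketch of what "a score and a CIGAR" means for a local alignment. It does not use the library discussed in this issue; the function name `smith_waterman`, the linear gap penalty, and the scoring values are all assumptions chosen for illustration, and the CIGAR covers only the aligned region (no soft clipping).

```python
from itertools import groupby


def smith_waterman(query, target, match=2, mismatch=-1, gap=-2):
    """Local alignment with a linear gap penalty; returns (score, cigar)."""
    rows, cols = len(query) + 1, len(target) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic programming matrix, tracking the best-scoring cell.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if query[i - 1] == target[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # match or mismatch
                score[i - 1][j] + gap,      # base in query only
                score[i][j - 1] + gap,      # base in target only
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    ops = []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if query[i - 1] == target[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            ops.append("M")
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            ops.append("I")  # insertion relative to the target
            i -= 1
        else:
            ops.append("D")  # deletion relative to the target
            j -= 1
    ops.reverse()

    # Run-length encode the operations into a CIGAR string.
    cigar = "".join(f"{len(list(run))}{op}" for op, run in groupby(ops))
    return best, cigar
```

For example, `smith_waterman("ACGT", "TTACGTTT")` finds the exact four-base match inside the longer target. This is quadratic in time and memory, so it is only useful for checking small cases against a library's output.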
Handle by disallowing alleles in non-match intervals which are entirely N. The consequence is that we need to allow for fewer sequences being considered after running kmeans_clustering. Also add...
I'd really like to use sourmash for metagenomic classification as in Portik et al. I have been trying it out on small datasets and I've noticed that the gather step...
All VCFs need the following in their header: ``` ##FORMAT= ##FORMAT= ##FORMAT= ##FORMAT= ##FORMAT= ##FORMAT= ##FORMAT= ##FORMAT= ##FORMAT= ##FORMAT= ``` as well as `##contig=` for each $id in the CHROM...
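A rough sketch of how the `##contig` half of this requirement could be checked. It assumes the standard VCF header form `##contig=<ID=...>` and tab-separated records; the function name `missing_contig_lines` is hypothetical, and the `##FORMAT` lines are not checked here because their required IDs are not shown above.

```python
import re

# Matches the ID field of a standard VCF contig header line.
CONTIG_ID = re.compile(r"^##contig=<.*?\bID=([^,>]+)")


def missing_contig_lines(lines):
    """Return CHROM values used in records but not declared via ##contig."""
    declared, used = set(), set()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            found = CONTIG_ID.match(line)
            if found:
                declared.add(found.group(1))
        elif line.startswith("#") or not line:
            continue  # the #CHROM column header or a blank line
        else:
            used.add(line.split("\t", 1)[0])  # first column is CHROM
    return sorted(used - declared)
```

An empty result means every CHROM value in the file has a matching `##contig` header line.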
Currently outputs a pangenome matrix file with a dodgy name (missing /?). It also currently outputs a directory for every gene found in any sample; make this one directory for everything.
In cluster finding/filtering, use collinearity to avoid some spurious hits.
Currently pandora allows two models for the kmer coverage distribution: negative binomial and binomial (approx. Poisson). The default is negative binomial. In https://github.com/rmcolq/pandora/blob/7adf63ce60d28a000f7b6c850f4a5ebbbf2dd031/src/estimate_parameters.cpp#L235, if I find that the mean and variance...
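As background on why the mean and variance matter for this choice, here is a sketch of the textbook method-of-moments fit for a negative binomial. It is not taken from estimate_parameters.cpp; the function name `negative_binomial_moments` and the choice to return `None` when no fit exists are assumptions made for illustration.

```python
def negative_binomial_moments(mean, variance):
    """Method-of-moments estimates (r, p) for a negative binomial.

    Uses the parameterisation with mean = r(1 - p)/p and
    variance = r(1 - p)/p**2, so p = mean/variance and
    r = mean**2 / (variance - mean).
    Returns None when variance <= mean, because a negative binomial
    is always overdispersed and no valid (r, p) exists in that case.
    """
    if mean <= 0 or variance <= mean:
        return None
    p = mean / variance
    r = mean * mean / (variance - mean)
    return r, p
```

When the observed variance is not larger than the mean, the estimates are undefined or negative, which is the situation where a binomial or Poisson-like model is the usual alternative.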
Pandora currently uses crude thresholds based on the estimated global coverage as compared to the mode or mean coverage along the path. This should probably be based on some intuition...