Daniel Cameron
We need to define a convention for unstranded methylation data. Any objections to standardising on the position of the first modified base in the motif on the positive strand and...
We don't want VCF to have to maintain its own database of base modification short names. Is there a database we can reference? Candidates include: - https://www.pmgenomics.ca/hoffmanlab/proj/dnamod/ - https://pubmed.ncbi.nlm.nih.gov/34893873/
Changes to the 4.5 draft located at https://github.com/samtools/hts-specs/pull/770
> This leads to an issue about how positions that are _not_ present in the VCF file are treated. Are they assumed to be uninformative (say, not covered by reads),...
`END` and `SVLEN` represent the same information. The new VCFv4.5 definition reflects this with `END` officially deprecated and redefined as a computed field based on what's in `SVLEN`. Think of it...
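To illustrate the idea of `END` as a value derived from `SVLEN`, here is a minimal Python sketch. It is a simplification and not the specification's wording: the function name `computed_end`, its parameters, and the set of symbolic allele types handled are assumptions made for illustration, and it ignores multi-allelic records and `FORMAT/LEN`. The authoritative rules are in the VCF specification itself.

```python
# Hypothetical helper: derive END from POS and SVLEN for a single allele.
# Assumption: for symbolic <DEL>/<DUP>/<INV>/<CNV> alleles the variant spans
# SVLEN reference bases after POS; everything else ends at the last REF base.
SPANNING_SV_TYPES = {"DEL", "DUP", "INV", "CNV"}


def computed_end(pos, ref, svtype=None, svlen=None):
    """Return the 1-based end position implied by POS, REF and SVLEN."""
    if svtype in SPANNING_SV_TYPES and svlen is not None:
        # abs() tolerates older files that wrote negative SVLEN for deletions.
        return pos + abs(svlen)
    # Insertions and ordinary alleles: END is the last base of REF.
    return pos + len(ref) - 1


if __name__ == "__main__":
    print(computed_end(100, "N", "DEL", 50))  # symbolic deletion
    print(computed_end(100, "N", "INS", 50))  # symbolic insertion
    print(computed_end(100, "ACGT"))          # plain multi-base REF
```

Under these assumptions a writer only needs to store `SVLEN`, and a reader can recover `END` on demand rather than trusting two fields to agree.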
The work-around of just running `bioconda-utils build` suggested to me on gitter is also problematic. After deleting my miniconda3 directory and reinstalling using the method in the bioconda documentation: ```...
It would be great if we defined the scope of the library. Specifically: - Is this a genomics kmer library, or a generic string kmer library? If it's a genomics,...
> I intend for the focus of this library to be efficient k-mer creation rust-debruijn appears to have a similar scope with specialised structs for small(ish) kmers. One consequence of...
Having both written my own kmer encoding scheme, and been burnt using libraries with a different scheme, I thought I'd weigh in. There are actually 4 independent choices one needs to...
The approach I'm advocating has a few nice properties that aren't present in other kmer encoding schemes. - An encoded 2-bit stream/array of length k represents a big-endian number of...
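As a rough illustration of a 2-bit big-endian kmer encoding, the Python sketch below packs each base into two bits with the first base in the most significant position. The base-to-bits mapping (`A=0, C=1, G=2, T=3`) and the function names `encode_kmer` and `decode_kmer` are assumptions chosen for the example; the comment above does not state which mapping is being advocated.

```python
# Assumed mapping; other schemes assign the 2-bit codes differently.
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def encode_kmer(kmer):
    """Pack a kmer into an int, first base in the most significant bits."""
    value = 0
    for base in kmer:
        value = (value << 2) | BASE_TO_BITS[base]
    return value


def decode_kmer(value, k):
    """Unpack an int produced by encode_kmer back into a k-length string."""
    return "".join(
        BITS_TO_BASE[(value >> (2 * (k - 1 - i))) & 0b11] for i in range(k)
    )


if __name__ == "__main__":
    kmers = ["TTA", "ACG", "CAT", "AAA"]
    encoded = sorted(encode_kmer(k) for k in kmers)
    # With this mapping, numeric order matches lexicographic order.
    print([decode_kmer(v, 3) for v in encoded])
```

With this particular mapping and bit order, sorting the encoded integers of equal-length kmers gives the same order as sorting the strings, which is one reason a big-endian layout can be convenient.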