Daniel Standage

Results 103 comments of Daniel Standage

I created an account (**standage**). I'd be happy to take ownership for now, and pass it on to someone else when I move on or to a lab-owned account if...

Hi @Nortalle! Our documentation is definitely intended for users of khmer's command-line tools, rather than developers using the Python API. Still, the concepts discussed on [this page](https://khmer.readthedocs.io/en/latest/user/choosing-table-sizes.html) are helpful. A...

Hi @taranglute. Your command looks mostly correct, although:

- It's not typical to run commands from the `scripts/` directory, so the `./` prefix may not make sense. Did you follow...

Just to be clear: the benefits of banding are manifold and not discussed here. The benefit of using the 2-bit hash is that it's much faster than MurmurHash3, and compatible...
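To make the idea of a 2-bit hash concrete, here is a minimal sketch: each nucleotide is packed into two bits, so the hash is cheap to compute and can be decoded back to the original k-mer. The base-to-bits mapping and the function names `twobit_hash` and `twobit_decode` are assumptions for illustration and may not match khmer's internal encoding.

```python
# Illustrative mapping only (an assumption); khmer's encoding may differ.
ENCODE = {"A": 0, "C": 1, "G": 2, "T": 3}
DECODE = "ACGT"


def twobit_hash(kmer):
    """Pack a k-mer into an integer, two bits per nucleotide."""
    value = 0
    for base in kmer:
        # Shift the running value left and append this base's two bits.
        # A non-ACGT character raises KeyError.
        value = (value << 2) | ENCODE[base]
    return value


def twobit_decode(value, k):
    """Recover the k-mer of length k from its 2-bit hash."""
    bases = []
    for _ in range(k):
        bases.append(DECODE[value & 3])  # lowest two bits are the last base
        value >>= 2
    return "".join(reversed(bases))


print(twobit_hash("ACGT"))
print(twobit_decode(twobit_hash("TTGACTTGACTTCAG"), 15))
```

Unlike a general-purpose hash such as MurmurHash3, this encoding only involves a shift and an OR per base, which is where the speed difference would be expected to come from.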

```
curl -L https://osf.io/f5trh/download?version=1 -o reads.fq.gz
sandbox/count-band-single-pass.py \
    --ksize 25 --num-bands 4 --buffersize 1e7 --memory 1e7 \
    --outfmt "bandtest{}.ct" \
    reads.fq.gz
```

The procedure above will result in the same...

We are investigating some use cases where we compute only on a single band, but there is at least one use case (motivated by kevlar and variant discovery) where we...
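As a rough sketch of what computing on a single band could look like: if the hash space is split into `num_bands` equal-width intervals, a pass over band `b` only handles k-mers whose hash value falls inside that band's interval. The helper names `band_interval` and `in_band`, the equal-width split, and the choice of `4 ** 25` as the hash space are all assumptions for illustration, not a description of khmer's implementation.

```python
def band_interval(num_bands, band, hash_space):
    """Return the half-open [lo, hi) hash interval covered by one band."""
    if not 0 <= band < num_bands:
        raise ValueError("band must be in [0, num_bands)")
    width = -(-hash_space // num_bands)  # ceiling division
    lo = band * width
    hi = min(lo + width, hash_space)
    return lo, hi


def in_band(hash_value, num_bands, band, hash_space):
    """True if hash_value belongs to the given band."""
    lo, hi = band_interval(num_bands, band, hash_space)
    return lo <= hash_value < hi


# Assumption: k = 25 with a 2-bit encoding gives 4**25 possible hash values.
space = 4 ** 25
intervals = [band_interval(4, b, space) for b in range(4)]
for b, (lo, hi) in enumerate(intervals):
    print(b, lo, hi)
```

Because the intervals are disjoint and together cover the whole hash space, every k-mer lands in exactly one band, so the bands can be processed independently.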

Another option is to simply ignore any k-mers containing a non-ACGT character. I don't know how this will affect performance of processing short reads, but it's essentially what I do...
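A minimal sketch of that option, skipping every k-mer that overlaps a non-ACGT character, is shown below. The function name `acgt_kmers` is hypothetical, and this is plain Python written for illustration rather than anything taken from khmer or kevlar.

```python
def acgt_kmers(sequence, k):
    """Yield each k-mer of `sequence` that contains only A, C, G, or T."""
    last_bad = -1  # index of the most recent non-ACGT character seen
    for i, base in enumerate(sequence):
        if base not in "ACGT":
            last_bad = i
        start = i - k + 1  # start index of the k-mer ending at position i
        # Emit the k-mer only if no invalid character lies inside it.
        if start >= 0 and last_bad < start:
            yield sequence[start:i + 1]


for kmer in acgt_kmers("ACGTNACGT", 3):
    print(kmer)
```

The sequence is scanned once, and a single `N` suppresses only the k-mers that overlap it, so the rest of the read still contributes.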

> @standage in kevlar should do his own danged preprocessing if he doesn't want to use our bulk loading code ;)

:) I use the bulk loading code for...loading count...

I'm adding some tests to kevlar to verify assumptions regarding non-ACGT are met. I'm getting the following error.

```python
>>> import khmer
>>> ct = khmer.Counttable(15, 1e6, 3)
>>> ct.consume('TTGACTTGACTTCAG')...
```

Yeah, if we want performance *AND* flexibility we need to implement the different use cases at the C++ level. Using a performant and reasonable default for bulk loading but providing...