Shaun Jackman

459 comments by Shaun Jackman

I'd suggest picking the lexicographically smallest possible nucleotide for that ambiguity code rather than a random one, to make the result deterministic.
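To make the suggestion concrete, here is a minimal sketch of deterministic ambiguity resolution. It assumes the standard IUPAC nucleotide codes; the names `IUPAC` and `resolve_ambiguity` are hypothetical and are not taken from any tool discussed in this thread.

```python
# Standard IUPAC nucleotide codes mapped to the bases they can represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def resolve_ambiguity(seq):
    """Replace each ambiguity code with its lexicographically smallest base.

    Hypothetical helper: characters that are not IUPAC codes pass through
    unchanged, and the input is upper-cased first.
    """
    return "".join(min(IUPAC.get(base, base)) for base in seq.upper())


print(resolve_ambiguity("ACGTN"))  # N can be A, C, G or T, so it becomes A
```

Because `min` always picks the same base for a given code, repeated runs on the same input give identical output, which a random choice would not.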

What's your use case, Cole? Is it that you have reads with `Ns` in them, or do you have reads with other IUPAC codes in them, or are you working...

Jared (@jts) is in a better position to answer that question than I am.

Sounds like `sed 's/ BX:Z:/:/'` would convert from Longranger basic FASTQ format to the format that EMA expects. Would you consider adding support for `BX:Z` format?
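For illustration, the same substitution can be sketched in Python. This only mirrors the `sed` expression above; that the result matches what EMA expects is the assumption made in the comment, not something confirmed here, and `convert_header` is a hypothetical name.

```python
def convert_header(line):
    """Replace the first ' BX:Z:' in a line with ':'.

    Mirrors sed 's/ BX:Z:/:/', which substitutes only the first match
    on each line. Lines without the tag are returned unchanged.
    """
    return line.replace(" BX:Z:", ":", 1)


# Example with a made-up read name and barcode.
print(convert_header("@read1 BX:Z:ACGTACGT-1"))
```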

```
preproc: preprocess barcoded FASTQ files (takes interleaved FASTQ via stdin)
  -h: apply Hamming-2 correction [off]
```

Cool! That's useful to me. I'm curious why `-h` is disabled by default....

What is the output format of `ema preproc`? Would you consider adding an option to output `BX:Z` format?

Ah, I think I misunderstood. The default then is Hamming-1 correction? I had incorrectly assumed that the default is no correction. Perhaps you could update the README.md to clarify which.

> ema preproc produces a special output format that isn't quite FASTQ; it puts everything for a read pair on a single line, which is convenient. That format is sometimes...

Yes, the `outs/barcoded.fastq.gz` of `longranger basic` does appear to be sorted by `BX:Z`! I hadn't noticed that before! I've been running `samtools sort -tBX` to sort by barcode! Hah. Thanks...

Does `ema preproc` sort by barcode by default? Assuming not, can the output of `ema preproc` be piped into `samtools sort -tBX`?