Chris Tomkins-Tinch

Results 30 issues of Chris Tomkins-Tinch

add `bbmap.BBMapTool().dedup_clumpify()`, along with unit tests; pass `JVMmemory` to bbmap and clumpify; add `rmdup_clumpify_bam` to `read_utils.py`; change `TestRmdupUnaligned` unit tests for bbmap to use `read_utils.py::rmdup_clumpify_bam`; add `dedup_bam` WDL task to...

To address https://github.com/broadinstitute/viral-classify/issues/1, this adds a new command, `krakenuniq_report_filter`, to `metagenomics.py`: ``` usage: metagenomics.py subcommand krakenuniq_report_filter [-h] [--fieldToFilterOn {num_reads,uniq_kmers}] [--fieldToAdjust {num_reads,uniq_kmers} [{num_reads,uniq_kmers} ...]] [--keepAboveN KEEP_THRESHOLD] [--loglevel {DEBUG,INFO,WARNING,ERROR,CRITICAL,EXCEPTION}] [--version] [--tmp_dir TMP_DIR]...

The old mvicuna post-processing code expects read IDs to have a `/1` mate suffix and only includes reads that do. Single-end reads do not have this suffix (nor do interleaved fastqs, but...
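As an illustration of the suffix problem described above, here is a minimal sketch of suffix-tolerant read ID matching. It is not the viral-ngs post-processing code; the function names `base_read_id` and `select_kept_reads` are hypothetical, and the sketch assumes read IDs are plain strings with an optional trailing `/1` or `/2`.

```python
import re

# Matches a trailing mate suffix such as "/1" or "/2" (assumed ID convention).
_MATE_SUFFIX = re.compile(r"/[12]$")


def base_read_id(read_id):
    """Return the read ID with any trailing /1 or /2 mate suffix removed."""
    return _MATE_SUFFIX.sub("", read_id)


def select_kept_reads(all_ids, kept_ids):
    """Keep every ID in all_ids whose base ID appears among kept_ids.

    Comparing base IDs means single-end reads (no suffix) are not
    silently dropped, and both mates of a kept pair are retained.
    """
    kept = {base_read_id(r) for r in kept_ids}
    return [r for r in all_ids if base_read_id(r) in kept]
```

Comparing on the base ID rather than requiring a literal `/1` is one way to handle paired and single-end input with the same code path.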

The command line function `fastq_to_bam` exists as part of `read_utils.py`, but it is not currently exposed via WDL. A task to call the function should be added to `tasks_read_utils.wdl` and...

The WDL 1.0 spec is now out and supported by Cromwell. The [WDL workflows](https://github.com/broadinstitute/viral-ngs/tree/ct-vphaser-wdl/pipes/WDL/workflows) of viral-ngs should be updated to adhere to this spec. Specs: * https://github.com/openwdl/wdl/blob/master/versions/draft-2/SPEC.md * https://github.com/openwdl/wdl/blob/master/versions/1.0/SPEC.md **Note**...

This updates pytest to 5.2.0, which [requires](https://docs.pytest.org/en/latest/py27-py34-deprecation.html) Python 3.5+. With Python 2.7 [EOL in 2020](https://legacy.python.org/dev/peps/pep-0373/#id4), we can upgrade pytest to 5.2.0 once we remove Python 2.7 from our build matrix....

In preliminary testing, clumpify is much faster than mvicuna, but does not seem to remove as many reads with the settings I tried (subs=5 to match mvicuna and passes=4). On...
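For reference, the settings mentioned above could be assembled into a clumpify invocation roughly as follows. This is a sketch, not the wrapper used in viral-ngs: the helper name `clumpify_dedup_args` is hypothetical, and the `in=`, `out=`, `dedupe=`, `subs=`, `passes=`, and `-Xmx` arguments reflect my understanding of the BBTools `clumpify.sh` command line and should be checked against the installed version.

```python
def clumpify_dedup_args(in_path, out_path, subs=5, passes=4, jvm_memory=None):
    """Build an argument list for a clumpify.sh deduplication run.

    subs: substitutions tolerated between duplicates (5 to match mvicuna).
    passes: number of clumpify passes.
    jvm_memory: optional JVM heap size such as "8g" (assumed -Xmx syntax).
    """
    args = [
        "clumpify.sh",
        f"in={in_path}",
        f"out={out_path}",
        "dedupe=t",
        f"subs={subs}",
        f"passes={passes}",
    ]
    if jvm_memory:
        args.append(f"-Xmx{jvm_memory}")
    return args
```

The list can be passed to `subprocess.check_call` once the flags have been verified; it is only constructed here, not executed.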