Chris Tomkins-Tinch
add `bbmap.BBMapTool().dedup_clumpify()`, along with unit tests; pass `JVMmemory` to bbmap and clumpify; add `rmdup_clumpify_bam` to `read_utils.py`; change `TestRmdupUnaligned` unit tests for bbmap to use `read_utils.py::rmdup_clumpify_bam`; add `dedup_bam` WDL task to...
based on tile count or flowcell suffix
To address https://github.com/broadinstitute/viral-classify/issues/1, this adds a new command, `krakenuniq_report_filter`, to `metagenomics.py`:

```
usage: metagenomics.py subcommand krakenuniq_report_filter [-h]
       [--fieldToFilterOn {num_reads,uniq_kmers}]
       [--fieldToAdjust {num_reads,uniq_kmers} [{num_reads,uniq_kmers} ...]]
       [--keepAboveN KEEP_THRESHOLD]
       [--loglevel {DEBUG,INFO,WARNING,ERROR,CRITICAL,EXCEPTION}]
       [--version] [--tmp_dir TMP_DIR]...
```
The old mvicuna post-processing code expects read IDs to have a `/1` mate suffix and only includes reads that have one. Single-end reads do not have this suffix (nor do interleaved fastqs, but...
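A minimal sketch of the suffix issue described above, not the actual viral-ngs post-processing code. The helpers `strip_mate_suffix` and `ids_to_keep` are hypothetical names; the sketch only illustrates how a filter that requires a `/1` suffix drops single-end read IDs, and how tolerating a missing suffix avoids that.

```python
def strip_mate_suffix(read_id):
    """Return (base_id, mate), where mate is 1, 2, or None if no suffix is present."""
    for mate in (1, 2):
        suffix = "/%d" % mate
        if read_id.endswith(suffix):
            return read_id[:-len(suffix)], mate
    return read_id, None


def ids_to_keep(read_ids, require_suffix=False):
    """Collect base read IDs.

    With require_suffix=True, only IDs ending in '/1' are kept, which
    mimics the strict behavior described above (an assumption about its
    effect, not a copy of the original code). With the default, IDs
    without a mate suffix are kept as well.
    """
    kept = set()
    for read_id in read_ids:
        base, mate = strip_mate_suffix(read_id)
        if require_suffix and mate != 1:
            continue
        kept.add(base)
    return kept
```

With `require_suffix=True`, an input such as `["a/1", "b"]` loses the single-end ID `b`; with the default, both IDs are retained.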
This closes issue #1001
The command-line function `fastq_to_bam` exists as part of `read_utils.py`, but it is not currently exposed via WDL. A task to call the function should be added to `tasks_read_utils.wdl` and...
The WDL 1.0 spec is now out and supported by Cromwell. The [WDL workflows](https://github.com/broadinstitute/viral-ngs/tree/ct-vphaser-wdl/pipes/WDL/workflows) of viral-ngs should be updated to adhere to this spec.

Specs:
* https://github.com/openwdl/wdl/blob/master/versions/draft-2/SPEC.md
* https://github.com/openwdl/wdl/blob/master/versions/1.0/SPEC.md

**Note**...
This updates pytest to 5.2.0, which [requires](https://docs.pytest.org/en/latest/py27-py34-deprecation.html) Python 3.5+. With Python 2.7 [EOL in 2020](https://legacy.python.org/dev/peps/pep-0373/#id4), we can upgrade pytest to 5.2.0 once we remove Python 2.7 from our build matrix....
In preliminary testing, clumpify is much faster than mvicuna, but does not seem to remove as many reads with the settings I tried (subs=5 to match mvicuna and passes=4). On...
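For reference, a hedged sketch of how the settings mentioned above (`subs=5`, `passes=4`) might be assembled into a clumpify command line. The function name `clumpify_dedup_cmd`, the file names, and the default memory value are hypothetical, and the flag spellings (`dedupe=t`, `subs=`, `passes=`, `-Xmx`) reflect my understanding of BBTools and should be checked against the installed version. The sketch only builds the argument list and does not run anything.

```python
def clumpify_dedup_cmd(in_fastq, out_fastq, subs=5, passes=4, jvm_memory="2g"):
    """Build an argument list for a clumpify.sh deduplication run.

    All defaults are illustrative; subs=5 and passes=4 mirror the
    settings quoted in the text above.
    """
    return [
        "clumpify.sh",
        "-Xmx" + jvm_memory,      # JVM heap size (assumed flag form)
        "in=" + in_fastq,
        "out=" + out_fastq,
        "dedupe=t",               # remove duplicates rather than only clumping
        "subs=%d" % subs,         # substitutions tolerated between duplicates
        "passes=%d" % passes,     # number of deduplication passes
    ]


cmd = clumpify_dedup_cmd("reads.fastq", "reads.dedup.fastq")
```

The resulting list could be handed to `subprocess.check_call(cmd)` once the flags have been verified.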