Matt Bookman

Results 13 issues of Matt Bookman

To load VCF data into BigQuery, Variant Transforms uses Cloud Dataflow. Dataflow now provides a flag that brings down the cost of using Dataflow and has...

Output from joint genotyping is a nicely squared-off matrix of variant rows and callset (sample) columns. Typically, around 90% of such data is high...

When a `--representative_header_file` is specified to `vcf_to_bq`, the merge_header pipeline shouldn't need to run, but it always does. It looks like the necessary change is to add a check in...

It would be great to be able to run `--run_annotation_pipeline` as a standalone step. Some users might not want to upload to BigQuery or you might want to operationally separate the...

I noticed that I was never getting Pipeline logs for my `vcf_to_bq` jobs. I would see in the output things like:

```
Output will be written to "''/runner_logs_20190716_044043.log"
```

Rather...

This does not appear to be a requirement from the VCF spec, but may be worthwhile to at least follow as a convention. Looking at the output of a joint...

I am using rc5 and running the GDC's transform.cwl with rabix: https://github.com/NCI-GDC/gdc-dnaseq-cwl/blob/master/workflows/dnaseq/transform.cwl ``` $ ./rabix/rabix --basedir ./out gdc-dnaseq-cwl/workflows/dnaseq/transform.cwl gdc-dnaseq-cwl/workflows/dnaseq/NA12878.chrom20.ILLUMINA.bwa.CEU.low_coverage.20121211.json [2017-05-04 20:34:58.521] [INFO] Job root.samtools_bamtobam has started [2017-05-04 20:34:58.585] [INFO] Pulling...

bug

IGV.js does not support CRAMs directly, but does support htsget. It would be great if the htsget server could support CRAMs. I know this is listed in the README: CRAM...

As noted here: https://cloud.google.com/container-registry/docs/#pushing_to_the_registry

> If your project ID has the form example.com:foo-bar, with Docker 1.8+ use:
>
> gcr.io/example.com/foo-bar/...

To support domain-scoped projects, we should sweep the examples and...
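The mapping described in the quoted documentation can be sketched in a few lines of Python. This is only an illustration of the `:` to `/` substitution for domain-scoped project IDs; the function name `gcr_image_path`, its parameters, and the image name `my-image` are hypothetical and do not come from the issue or from any Google tool.

```python
def gcr_image_path(project_id: str, image: str, host: str = "gcr.io") -> str:
    """Build a registry image path from a project ID (hypothetical helper).

    A domain-scoped project ID has the form "example.com:foo-bar". The
    registry path uses "/" where the project ID uses ":", so the colon is
    replaced. IDs without a colon pass through unchanged.
    """
    return f"{host}/{project_id.replace(':', '/')}/{image}"


# Domain-scoped project ID: the colon becomes a slash.
print(gcr_image_path("example.com:foo-bar", "my-image"))
# Ordinary project ID: no change to the ID itself.
print(gcr_image_path("my-project", "my-image"))
```

Routing every example through a helper like this, rather than concatenating `gcr.io/` and the project ID by hand, would let the same example text work for both kinds of project ID.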