guacamole
write out a VCF file instead of a VCF directory with a part-0000 file
Bumping this.
I'd like to be able to directly load a VCF generated by guacamole into varcode, and be able to run guacamole repeatedly and have it just overwrite the output file.
Even very large VCFs should be easy for a single node to write. We should just collect all the called variants onto one node (perhaps the driver) and write out a regular VCF.
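For illustration only, here is a rough sketch of the "collect, then write one file" idea in plain Python. It is not guacamole code: `write_single_vcf` and the `(contig, position, line)` record shape are hypothetical stand-ins for whatever the driver would hold after collecting the calls. It writes to a temporary file and then renames it over the target, so rerunning simply overwrites the previous output.

```python
import os
import tempfile


def write_single_vcf(header_lines, records, path):
    """Write header and records to one VCF file at `path`, replacing any existing file.

    `records` is an iterable of (contig, position, line) tuples. This shape is an
    assumption made for the sketch; it stands in for variants collected on one node.
    """
    # Order by contig name (plain string order), then by position.
    ordered = sorted(records, key=lambda r: (r[0], r[1]))

    # Write next to the target so the final rename stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".vcf.tmp")
    try:
        with os.fdopen(fd, "w") as out:
            for line in header_lines:
                out.write(line.rstrip("\n") + "\n")
            for _, _, line in ordered:
                out.write(line.rstrip("\n") + "\n")
        # Replaces an existing output file instead of failing on it.
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)
        raise
```

Calling it twice with the same `path` leaves a single regular file behind, which is the behavior asked for above. Real contig ordering would probably follow the sequence dictionary rather than string order.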
ADAM has recently added functionality for "easily" writing single-file VCFs and SAMs/BAMs; could be useful here, or just useful for us to mimic. https://github.com/bigdatagenomics/adam/pull/689