gatk
Adding VCF index argument to SelectVariants
Feature request
Tool(s) or class(es) involved
SelectVariants
Description
To run SelectVariants on VCF inputs that are stored in separate locations from their index files, or to stream SelectVariants inputs over HTTPS from Azure Blob Storage, we need a way to provide the index file in an argument separate from the -V input. @jamesemery started thinking this through (copied from Slack):
In featureDataSource.getTribbleFeatureReader() we currently initialize the datasources in getFeatureReader(), which gets called by VariantWalker.initializeDrivingVariants(). You could stick an override into that where you thread down the path for the index source through that path and optionally (only if the index is explicitly supplied by the user) push it down into the getTribbleFeatureReader() calls at the bottom of the stack there.
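To make the suggestion above concrete, here is a rough Java sketch of threading an optional index path from the walker level down to the point where the reader is opened. Every class and method name in it is a hypothetical stand-in, not actual GATK or HTSJDK code, and the default-suffix rule (.tbi for .gz inputs, .idx otherwise) is a simplifying assumption for illustration only.

```java
import java.util.Optional;

/** Hypothetical sketch only; none of these names are real GATK or HTSJDK APIs. */
final class IndexThreadingSketch {

    /** Stand-in for the feature reader created at the bottom of the call stack. */
    static final class Reader {
        final String variants;
        final String index;

        Reader(String variants, String index) {
            this.variants = variants;
            this.index = index;
        }
    }

    /** Assumed convention: the companion index sits next to the variants file. */
    static String defaultIndexFor(String variants) {
        String suffix = variants.endsWith(".gz") ? ".tbi" : ".idx";
        return variants + suffix;
    }

    /** Bottom of the stack: use the explicit index only if one was supplied. */
    static Reader openReader(String variants, Optional<String> explicitIndex) {
        String index = explicitIndex.orElseGet(() -> defaultIndexFor(variants));
        return new Reader(variants, index);
    }

    /** Top of the stack: receives the user's arguments and threads the index down. */
    static Reader initializeDrivingVariants(String variants, String indexArgOrNull) {
        return openReader(variants, Optional.ofNullable(indexArgOrNull));
    }

    public static void main(String[] args) {
        // No index argument: fall back to the assumed sibling-file convention.
        Reader byConvention = initializeDrivingVariants("calls.vcf.gz", null);
        // Explicit index argument: used as given, wherever it lives.
        Reader explicit = initializeDrivingVariants(
                "calls.vcf.gz", "https://example.org/indices/calls.vcf.gz.tbi");
        System.out.println(byConvention.index);
        System.out.println(explicit.index);
    }
}
```

The point of the Optional parameter is that existing callers keep their current behavior, and only an explicitly supplied index changes which file is opened.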
@droazen any thoughts on this? @VJalili Would adding this feature to SelectVariants be useful for your pipelines at all?
@meganshand We would like to add in a global mechanism for passing in explicit VCF indices, which we have long had for BAM/CRAM indices. @ldgauthier has requested this many times as well. We have a mechanism implemented in HTSJDK that allows you to pass in a JSON file (which we call a Bundle) containing URLs to both the main file and companion index, wherever you can currently pass in a raw file URL. Would this meet your needs?
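As a rough picture of the idea, the Java sketch below writes out a small JSON document that pairs a main file URL with its companion index URL. The key names "primary" and "index" and the class BundleSketch are hypothetical and chosen only for illustration; they are not taken from the HTSJDK Bundle schema, which should be checked in the HTSJDK documentation before writing a real bundle file.

```java
/** Hypothetical sketch; the JSON keys are illustrative, not the HTSJDK Bundle schema. */
final class BundleSketch {

    /** Escape backslashes and double quotes so the URL is a valid JSON string. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Build a two-entry JSON object: the main file and its companion index. */
    static String toJson(String primaryUrl, String indexUrl) {
        return "{\n"
             + "  \"primary\": \"" + escape(primaryUrl) + "\",\n"
             + "  \"index\": \"" + escape(indexUrl) + "\"\n"
             + "}";
    }

    public static void main(String[] args) {
        // The two URLs can live in entirely different locations.
        System.out.println(toJson(
                "https://example.org/data/calls.vcf.gz",
                "https://example.org/indices/calls.vcf.gz.tbi"));
    }
}
```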
@droazen Yes, that sounds very convenient to use.
Hi, I would like to be able to pass fasta index files in separate locations for HaplotypeCaller as well. Is there currently an option for that? I see that --read-index is for passing .bai explicitly but do not see any such flag for .fai.