hts-specs

htsget: add `samples` query parameter, principally to select subset of VCF columns

Open mlin opened this issue 5 years ago • 5 comments

e.g.

GET /htsget/1000genomes/variants?format=VCF&samples=NA12878,NA12877

Previously I circulated a different version of this with a repeated sample=x query parameter. This single comma-separated list is more consistent with the existing fields and tags query parameters.

mlin avatar Jul 10 '19 15:07 mlin

Can a sample name have a comma in it? (@daviesrob)

mlin avatar Jul 10 '19 16:07 mlin

Correct me if I'm wrong, but I don't believe it is forbidden for a VCF sample name to contain a comma.

One possibility is to specify URI encoding of each element in the comma-separated list, so any comma within the sample name would be percent-encoded. A comma-separated list of individually URI-encoded elements seems like it would be a slightly tortured construct, though.
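A rough sketch of what the per-element encoding could look like, in Python. The helper names `encode_samples` and `decode_samples` are made up for illustration and are not part of htsget. Note the catch that makes this feel tortured: the server has to split the raw, still-encoded query value on commas before decoding, which many web frameworks do not expose by default.

```python
from urllib.parse import quote, unquote

def encode_samples(names):
    # Hypothetical helper: percent-encode each name on its own
    # (safe="" so that "," becomes %2C), then join with literal commas.
    return ",".join(quote(name, safe="") for name in names)

def decode_samples(raw_value):
    # raw_value must be the raw, still-encoded query value:
    # split on literal commas first, then decode each element.
    return [unquote(part) for part in raw_value.split(",")]

names = ["NA12878", "NA12877", "quick, brown fox"]
encoded = encode_samples(names)
print("GET /htsget/1000genomes/variants?format=VCF&samples=" + encoded)
print(decode_samples(encoded))
```

If the value were decoded before splitting, the %2C inside the third name would turn back into a comma and the list would come out with four elements instead of three.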

Another possibility is reverting to the first straw man idea of providing the list through repeated query parameters, where each individual parameter would then be query string encoded as usual, e.g.

GET /htsget/1000genomes/variants?format=VCF&samples=NA12878&samples=NA12877&samples=quick%2C%20brown%20fox

The wart is that it's dissimilar from the existing fields & tags parameters.
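For comparison, a minimal sketch of the repeated-parameter form using only Python's standard library. Nothing here is htsget-specific; it only shows that ordinary query string tooling already round-trips this shape without any custom splitting.

```python
from urllib.parse import urlencode, parse_qs, quote

# Repeated "samples" keys, one per sample name.
params = [
    ("format", "VCF"),
    ("samples", "NA12878"),
    ("samples", "NA12877"),
    ("samples", "quick, brown fox"),
]

# quote_via=quote encodes spaces as %20 rather than "+".
query = urlencode(params, quote_via=quote)
print("GET /htsget/1000genomes/variants?" + query)

# On the receiving side, parse_qs collects repeated keys into a list.
samples = parse_qs(query)["samples"]
print(samples)
```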

mlin avatar Jul 10 '19 16:07 mlin

Another idea: make comma the default delimiter, and add a parameter to control the delimiter?

nh13 avatar Jul 11 '19 03:07 nh13

@daviesrob's suggestion: use a percent-encoded tab (%09) as the delimiter.
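A sketch of how the tab-delimited variant might look. My understanding of the appeal is that the VCF header line is itself tab-delimited, so a sample name cannot contain a tab and no escaping of the delimiter is needed. The helper names below are hypothetical, not anything from the spec.

```python
from urllib.parse import quote, unquote

def encode_samples_tab(names):
    # Hypothetical helper: join names with a tab, then percent-encode
    # the whole value, so the delimiter goes on the wire as %09.
    for name in names:
        if "\t" in name:
            raise ValueError("sample name contains a tab: %r" % name)
    return quote("\t".join(names), safe="")

def decode_samples_tab(raw_value):
    # Here it is safe to decode first and split afterwards,
    # because a tab can only ever be the delimiter.
    return unquote(raw_value).split("\t")

names = ["NA12878", "NA12877", "quick, brown fox"]
encoded = encode_samples_tab(names)
print("GET /htsget/1000genomes/variants?format=VCF&samples=" + encoded)
print(decode_samples_tab(encoded))
```

Unlike the comma-separated form, this works with frameworks that decode query values before handing them to the application.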

mlin avatar Sep 04 '19 16:09 mlin

Also, can sampleID be empty?

yfarjoun avatar Sep 16 '19 15:09 yfarjoun