Allow specification of file type for regions-file

Open CreRecombinase opened this issue 4 years ago • 4 comments

I appreciate that regidx.c isn't currently up to the task of reading a bcf, but it's more than a little surprising to see that bcftools merge A.bcf B.bcf -R <(bcftools view -G A.bcf) -Ob C.bcf doesn't give me a C.bcf with the samples of A and B at the sites of A.

CreRecombinase avatar Jan 07 '21 21:01 CreRecombinase

I quite like the idea of being able to use VCF/BCF as a list of locations. It makes it more like the unix "join" command.

Meanwhile, have you looked into bcftools isec? I think that followed by a merge on a couple of the output files may do what you want. However, it cannot be used in a pipeline as above, because it needs an index and it doesn't output to stdout.

jkbonfield avatar Jan 08 '21 09:01 jkbonfield

I agree it would be a nice feature to have. One can obtain a list of sites for use with -R by running

bcftools query -f'%CHROM\t%POS\n' C.bcf | bgzip -c > sites.txt.gz
tabix -s1 -b2 -e2 sites.txt.gz
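
Applied to the original example, the workaround might look like the sketch below. The file names follow the thread. The function names vcf_to_sites and merge_at_sites are hypothetical helpers, not part of bcftools. It assumes bcftools, bgzip and tabix are on PATH and that both inputs are indexed, since merge needs indexed inputs.

```shell
#!/bin/sh
# Sketch only: vcf_to_sites and merge_at_sites are illustrative helper names,
# not bcftools commands.

# Reduce VCF text on stdin to tab-separated CHROM and POS, skipping header lines.
vcf_to_sites() {
    awk -F'\t' '!/^#/ { print $1 "\t" $2 }'
}

# Build a tabix-indexed site list from the first file, then merge both files
# restricted to those positions. Assumes bcftools, bgzip and tabix are installed
# and that both input files are already indexed.
merge_at_sites() {
    a=$1; b=$2; out=$3
    sites=$(mktemp -d)/sites.txt.gz
    bcftools view -G "$a" | vcf_to_sites | bgzip -c > "$sites" &&
    tabix -s1 -b2 -e2 "$sites" &&
    bcftools merge -R "$sites" -Ob -o "$out" "$a" "$b"
}

# Example call (not executed here):
#   merge_at_sites A.bcf B.bcf C.bcf
```

Note that -R selects by position, so, as far as I understand it, records in B.bcf that share a position with a site in A.bcf but carry different alleles would still be included.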

Why it has not been done is just a matter of time and priorities. Unfortunately there are many other features that would be good to have with no easy workaround.

pd3 avatar Jan 08 '21 10:01 pd3

See also #690, which somewhat similarly asks for --targets-file foo.bcf.

jmarshall avatar Feb 18 '21 13:02 jmarshall

I think one could accomplish bcftools merge A.bcf B.bcf -R <(bcftools view -G A.bcf) -Ob C.bcf with better detection of file type or a way of overriding file type detection (see here). You don't even need to add a bcf reader.

CreRecombinase avatar Feb 18 '21 18:02 CreRecombinase