
[Feature Request] Cannot sort using custom contig order

Open rickymagner opened this issue 1 year ago • 2 comments

Hi, I have some VCFs where the sequence dictionary in the header is out of the canonical order because of a tool's decisions. I'd like to be able to sort my VCF to follow the "usual" order, in other words to sort according to a custom order of the contigs, e.g. the order from a reference fai.

Here are some possible ways some new features in bcftools could allow for this.

  • Update bcftools reheader -f ref.fai to also force the header's contig order to match the order in ref.fai. I'd imagine this is the simplest to implement, and it could then be followed with a bcftools sort to get the records to match this ordering. It is not strictly backwards compatible, though, since the behavior of an existing flag would change.
  • Update bcftools sort to accept a -f ref.fai input that does both of the things described above: update the header so its sequence dictionary matches the order in ref.fai, and sort all the records according to this order.

Unless I'm missing something, there is currently no way to (easily) achieve this with bcftools.
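
To make the request concrete, here is a minimal Python sketch of the behavior I have in mind. It is only an illustration done outside bcftools, not how bcftools works or would implement it. It assumes an uncompressed, plain-text VCF small enough to hold in memory, and the function names fai_order and reorder_vcf are made up for this example. Contigs that are missing from the fai are placed last, which is also just an assumption.

```python
import re


def fai_order(fai_text):
    """Return contig names in the order they appear in a .fai index."""
    return [line.split("\t")[0] for line in fai_text.splitlines() if line.strip()]


def reorder_vcf(vcf_text, contigs):
    """Reorder ##contig header lines and sort records by the given contig order.

    Hypothetical helper: works on plain-text VCF content held in memory.
    """
    rank = {name: i for i, name in enumerate(contigs)}
    unknown = len(rank)  # assumption: contigs absent from the .fai sort last

    meta, contig_lines, chrom_line, records = [], [], None, []
    for line in vcf_text.splitlines():
        if line.startswith("##contig="):
            contig_lines.append(line)
        elif line.startswith("##"):
            meta.append(line)
        elif line.startswith("#CHROM"):
            chrom_line = line
        elif line:
            records.append(line)
    if chrom_line is None:
        raise ValueError("no #CHROM header line found")

    def contig_id(header_line):
        # Pull the ID=... value out of a ##contig=<...> line.
        m = re.search(r"ID=([^,>]+)", header_line)
        return m.group(1) if m else ""

    def record_key(record):
        chrom, pos = record.split("\t", 2)[:2]
        return (rank.get(chrom, unknown), int(pos))

    # Python's sort is stable, so ties keep their original relative order.
    contig_lines.sort(key=lambda l: rank.get(contig_id(l), unknown))
    records.sort(key=record_key)
    return "\n".join(meta + contig_lines + [chrom_line] + records) + "\n"


# Tiny made-up inputs: the header lists chr2 before chr1, the fai lists chr1 first.
fai = "chr1\t1000\t6\t60\t61\nchr2\t800\t1030\t60\t61\n"
vcf = (
    "##fileformat=VCFv4.2\n"
    "##contig=<ID=chr2,length=800>\n"
    "##contig=<ID=chr1,length=1000>\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr2\t50\t.\tA\tG\t.\t.\t.\n"
    "chr1\t200\t.\tC\tT\t.\t.\t.\n"
    "chr1\t100\t.\tG\tA\t.\t.\t.\n"
)
print(reorder_vcf(vcf, fai_order(fai)), end="")
```

In this sketch the ##contig lines are moved to the end of the other ## header lines; a real implementation would probably want to keep them in place and would need to handle compressed and indexed files.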

rickymagner avatar Nov 26 '24 20:11 rickymagner

I am not opposed to adding this feature, but it is unlikely to happen by my doing. What is the motivation for this request? The VCF specification does not mandate any specific order of the contigs, so programs should not rely on it.

pd3 avatar Dec 03 '24 14:12 pd3

The motivation is that some tools write records unsorted, and bcftools sort can then only sort them according to the sequence dictionary in the header. This means that if you want to do anything where you iterate over a family of files (e.g. your VCF, a BED file, a BAM, etc.), you are unable to traverse them "together", since they are sorted according to different conventions. It would be great to be able to coerce the ordering in your VCF to match the "normal" convention that all your other files follow.

rickymagner avatar Dec 03 '24 15:12 rickymagner