
feature request: tabix with space delimiter

Open winni2k opened this issue 11 years ago • 2 comments

I would like to index a file in WTCCC haps format so that I can pull out regions of interest. It strikes me that bgzip and tabix would work on this if the file were tab-delimited instead of space-delimited. Before I go off and replace all the spaces with tabs, I was wondering how hard it would be to implement a runtime or even compile-time option for bgzip and tabix that allows delimiters other than tab.

winni2k, Mar 04 '14 13:03

As the name "tabix", the manual page, and the paper all imply, tabix only works with TAB-delimited formats. You can easily convert other delimiters, for example with tr " " "\t" < input.txt.
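A minimal sketch of that workaround, end to end. The file names and sample rows are made up for illustration, and the column layout (sequence name in column 1, position in column 3) is an assumption about the haps file rather than something stated in this thread. The bgzip and tabix steps only run if both tools are installed.

```shell
# Hypothetical sample data: two space-delimited rows in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' '1 rs1 1000 A G 0 1' '1 rs2 2000 C T 1 0' > "$workdir/example.haps"

# Replace every space with a TAB. Safe only if no field contains a space.
tr ' ' '\t' < "$workdir/example.haps" > "$workdir/example.tsv"

# Compress, index, and query, but only when the htslib tools are available.
if command -v bgzip >/dev/null 2>&1 && command -v tabix >/dev/null 2>&1; then
    # -c writes to stdout, so the uncompressed file is kept.
    bgzip -c "$workdir/example.tsv" > "$workdir/example.tsv.gz"
    # Assumed layout: sequence name in column 1, begin and end both in column 3.
    tabix -s 1 -b 3 -e 3 "$workdir/example.tsv.gz"
    # Pull out a region of interest.
    tabix "$workdir/example.tsv.gz" 1:1500-2500
fi
```

The file must already be sorted by sequence name and position for the index to be usable.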

That said, it would be fine to add a new command-line switch that optionally splits fields on spaces, but this should not be the default behavior: several TAB-delimited formats permit spaces within fields, so splitting on spaces would break them.
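To illustrate why space splitting cannot be the default, here is a small sketch using a made-up GTF-style record, whose last column contains spaces. It counts fields with awk under both splitting rules; the record and variable names are hypothetical.

```shell
# A TAB-delimited record whose ninth column contains spaces (GTF-style attributes).
line=$(printf 'chr1\tsrc\tgene\t100\t200\t.\t+\t.\tgene_id "g1"; gene_name "abc";')

# Count fields when splitting on TAB only.
by_tab=$(printf '%s\n' "$line" | awk -F'\t' '{ print NF }')

# Count fields when a space is also treated as a delimiter.
by_space=$(printf '%s\n' "$line" | awk -F'[ \t]' '{ print NF }')

echo "fields when splitting on TAB only: $by_tab"
echo "fields when splitting on TAB or space: $by_space"
```

The two counts differ because the attribute column is cut into several pieces once spaces count as delimiters, so any column number after the first space-containing field would point at the wrong data.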

lh3, Mar 04 '14 16:03

We could in principle allow arbitrary delimiters. It would make sense to store the delimiter with the other information in the header of the .tbi file, but that would break backward compatibility. I am not convinced it is worth it.

pd3, Mar 06 '14 16:03