jbrowse
Consider support for pairix indexed bgzip files
Consider implementing support for pairix indexed bgzip files:
https://github.com/4dn-dcic/pairix#pairix
This essentially allows bedpe files to be 2D indexed.
This is likely a solution for any of the arc/rearrangement tracks (provided the data is converted to bedpe input).
This would work well for structural rearrangements: the width of an arc's end can indicate an imprecise junction at one or both ends, and the index can ease the burden of loading overlapping data. Rearrangements between different chromosomes can also be clearly marked.
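To make the "2D" idea concrete, here is a minimal sketch of the query such an index would answer: return only the pairs whose first end falls in one region and whose second end falls in another. It assumes the standard first six BEDPE columns (chrom1, start1, end1, chrom2, start2, end2, 0-based half-open). The names `BedpeRecord`, `parse_bedpe_line` and `query_2d` are hypothetical, and the linear scan is only a stand-in for the lookup; it is not how pairix works internally.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

Region = Tuple[str, int, int]  # (chrom, start, end), 0-based half-open


@dataclass(frozen=True)
class BedpeRecord:
    chrom1: str
    start1: int
    end1: int
    chrom2: str
    start2: int
    end2: int


def parse_bedpe_line(line: str) -> BedpeRecord:
    """Parse the first six tab-separated BEDPE columns; extra columns are ignored."""
    f = line.rstrip("\n").split("\t")
    return BedpeRecord(f[0], int(f[1]), int(f[2]), f[3], int(f[4]), int(f[5]))


def overlaps(chrom: str, start: int, end: int, region: Region) -> bool:
    """True if [start, end) on chrom intersects the half-open region."""
    r_chrom, r_start, r_end = region
    return chrom == r_chrom and start < r_end and r_start < end


def query_2d(records: Iterable[BedpeRecord], region1: Region, region2: Region) -> List[BedpeRecord]:
    """Naive 2D query: end 1 must hit region1 AND end 2 must hit region2.

    A real 2D index would avoid scanning every record; this scan only
    illustrates which records such a query should return.
    """
    return [
        r for r in records
        if overlaps(r.chrom1, r.start1, r.end1, region1)
        and overlaps(r.chrom2, r.start2, r.end2, region2)
    ]
```

With a query like this, a viewer showing chromosome 1 could ask only for arcs whose other end lands in a second region of interest, instead of loading every overlapping feature.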
Thanks for suggesting this. I asked them whether they have a file format specification available but haven't had a reply yet: https://github.com/4dn-dcic/pairix/issues/60
This could involve reading the source code, or we could consider alternatives to pairix. For example, the .hic format is pairwise, I think, and has a node module available for it (https://github.com/igvteam/hic-straw). bedInteract from UCSC is another possible alternative, but I don't know if it is a "real" 2D index.
Finally, just to add to the brainstorm, we also want to implement the VCF breakend spec, which is pairwise by nature (but actually, it is sort of more than pairwise, since it can integrate multiple pairwise things in a single "event").
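For reference, the pairwise part of a breakend record lives in the ALT field, which the VCF spec writes in one of four forms: `t[p[`, `t]p]`, `]p]t` or `[p[t`, where `t` is the local sequence and `p` is the mate position as `chrom:pos`. Below is a rough sketch of pulling the mate out of that string. The `MateBreakend` type and `parse_breakend_alt` function are hypothetical names, and the sketch deliberately ignores single breakends and the event-level grouping mentioned above.

```python
import re
from dataclasses import dataclass
from typing import Optional

# local bases, a bracket, mate "chrom:pos", the same bracket again, local bases
_BND_ALT = re.compile(
    r"^(?P<pre>[^\[\]]*)(?P<bracket>[\[\]])"
    r"(?P<chrom>[^\[\]]+):(?P<pos>\d+)"
    r"(?P=bracket)(?P<post>[^\[\]]*)$"
)


@dataclass(frozen=True)
class MateBreakend:
    mate_chrom: str
    mate_pos: int        # 1-based, as written in the VCF
    bracket: str         # "[" or "]", the orientation marker from the ALT
    local_bases: str     # the "t" part of the ALT
    joined_after: bool   # True for t[p[ / t]p], False for ]p]t / [p[t


def parse_breakend_alt(alt: str) -> Optional[MateBreakend]:
    """Return the mate described by a paired-breakend ALT, or None if it is not one."""
    m = _BND_ALT.match(alt)
    if m is None:
        return None
    pre, post = m.group("pre"), m.group("post")
    # Exactly one side of the brackets carries the local bases.
    if bool(pre) == bool(post):
        return None
    return MateBreakend(
        mate_chrom=m.group("chrom"),
        mate_pos=int(m.group("pos")),
        bracket=m.group("bracket"),
        local_bases=pre or post,
        joined_after=bool(pre),
    )
```

Each parsed record gives one (position, mate position) pair, which is the same shape of data a BEDPE line carries; tying several pairs into one event would need the INFO fields as well.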
We have lots of files that follow the VCF breakend spec, as I think I've mentioned in the past.
I mentioned this because I've been working with some Hi-C protocol stuff, and scientists will find bedpe+pairix easier to work with than hic and cooler.
Would you be able to share any of these VCF breakend files (or your proposed BEDPE files even)? @rbuels is actively looking for some data for testing
The VCF has a comparable bedpe (we generate both).
See this archive:
ftp://ftp.sanger.ac.uk/pub/cancer/dockstore/expected/dockstore-cgpwgs-expected.tar.gz
Within that:
WGS_COLO-829_vs_COLO-829-BL/brass/COLO-829_vs_COLO-829-BL.annot.vcf.gz
WGS_COLO-829_vs_COLO-829-BL/brass/COLO-829_vs_COLO-829-BL.annot.bedpe.gz
Human GRCh37 (no chr prefix)
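A small sketch for anyone who wants to peek at the BEDPE file from Python after unpacking the archive. It assumes that any header lines start with `#` (not checked against the actual file) and that the data lines are tab-separated. Both function names are hypothetical, and `strip_chr_prefix` is only there for comparing against references that do use the `chr` prefix.

```python
import gzip
from typing import Iterator, List


def iter_bedpe_rows(path: str) -> Iterator[List[str]]:
    """Yield the tab-split fields of each data line in a gzip-compressed BEDPE file."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.rstrip("\n")
            # Assumption: header/comment lines, if present, start with '#'.
            if not line or line.startswith("#"):
                continue
            yield line.split("\t")


def strip_chr_prefix(name: str) -> str:
    """Map 'chr1' style names to the prefix-free '1' style used by this data."""
    return name[3:] if name.startswith("chr") else name
```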
Hi folks, the pairix spec is available on the repo now 4dn-dcic/pairix#60
@SooLee thank you! I will check it out