bam-js
Htsget unable to fetch header from some endpoints
Ref ga4gh endpoint here
https://github.com/samtools/hts-specs/pull/530
The ga4gh server rejects the bogus refname that we currently send
We'd have to adjust to not specify any refname, but then it returns very large data chunks, and we'd have to range-request within the results it gives back
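A minimal sketch of what "range-requesting the results" might look like, assuming the ticket follows the htsget spec shape (a `urls` array whose entries carry a `url` and optional `headers`). `clipFirstBlock` and the `maxBytes` cap are hypothetical names for illustration, not anything bam-js has today, and a fixed byte cap is only a heuristic since the real header length isn't known up front:

```typescript
// Shape of one entry in an htsget ticket's `urls` array (per the htsget spec).
interface TicketUrl {
  url: string
  headers?: Record<string, string>
  class?: 'header' | 'body'
}

// Hypothetical helper: keep only the first block of a ticket and narrow its
// Range header so that at most `maxBytes` bytes are requested.
function clipFirstBlock(urls: TicketUrl[], maxBytes: number): TicketUrl[] {
  if (urls.length === 0) {
    return []
  }
  const first = urls[0]
  // Inline data: URIs are already small, so pass them through untouched.
  if (first.url.indexOf('data:') === 0) {
    return [first]
  }

  // Copy the headers, pulling out any existing Range (case-insensitive).
  const headers: Record<string, string> = {}
  let serverRange: string | undefined
  const original = first.headers ?? {}
  for (const key of Object.keys(original)) {
    if (key.toLowerCase() === 'range') {
      serverRange = original[key]
    } else {
      headers[key] = original[key]
    }
  }

  // Parse "bytes=start-end"; a missing Range means "whole file from byte 0".
  const match = serverRange ? /^bytes=(\d+)-(\d*)$/.exec(serverRange) : null
  const start = match ? Number(match[1]) : 0
  const serverEnd = match && match[2] !== '' ? Number(match[2]) : Infinity

  // Never request past what the server offered, and never more than the cap.
  const end = Math.min(serverEnd, start + maxBytes - 1)
  headers.Range = `bytes=${start}-${end}`
  return [{ ...first, headers }]
}
```

If the header turns out to be longer than the cap, the caller would have to retry with a larger `maxBytes`, which is roughly the "be more like samtools" incremental read.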
I suspect that's partly being tackled by @jb-adams in https://github.com/ga4gh/htsget-refserver/pull/8 ?
Could be! On some level, I think this code should work out how to behave more like samtools, but I'll certainly check the PR, especially if it is deployed somewhere
For dnanexus's web server, we request a bogus refname because otherwise the "header" response involves a download of 10GB of data, and we don't try to "subselect" from the range that it gives us
We could consider dropping support for dnanexus's htsget server so that ga4gh's htsget server works, or find a fix that accommodates both, or just leave it as is
See the behavior of the dnanexus server here
#range is the entire file, e.g. 140gb, which our code doesn't currently try to subselect from resulting in bad behavior if used
http://htsnexus.rnd.dnanex.us/v1/reads/BroadHiSeqX_b37/NA12878?class=header
#reasonable size, all data encoded in a data uri even
http://htsnexus.rnd.dnanex.us/v1/reads/BroadHiSeqX_b37/NA12878?class=header&referenceName=DOES_NOT_EXIST
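To make the two request styles above concrete, here is a rough sketch of a helper that builds either URL. `headerUrl` and its `placeholderRefName` parameter are hypothetical names for illustration, not the current bam-js API; only the `class` and `referenceName` query parameters come from the URLs above:

```typescript
// Hypothetical helper: build an htsget `class=header` request URL.
// Passing `placeholderRefName` reproduces the dnanexus workaround;
// omitting it gives the plain request that the ga4gh server accepts.
function headerUrl(readsUrl: string, placeholderRefName?: string): string {
  const params = ['class=header']
  if (placeholderRefName !== undefined) {
    params.push(`referenceName=${encodeURIComponent(placeholderRefName)}`)
  }
  // Append to an existing query string if the base URL already has one.
  const separator = readsUrl.indexOf('?') === -1 ? '?' : '&'
  return readsUrl + separator + params.join('&')
}

const base = 'http://htsnexus.rnd.dnanex.us/v1/reads/BroadHiSeqX_b37/NA12878'
const plainHeaderUrl = headerUrl(base)
const workaroundHeaderUrl = headerUrl(base, 'DOES_NOT_EXIST')
```

A per-server option like this would let the workaround stay available for the dnanexus server without being sent to servers that reject unknown reference names.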
IIRC the htsnexus htsget server might not be as up to date as the GA4GH reference htsget server? Please refer to the official public GA4GH server endpoints mentioned here:
https://github.com/igvteam/igv.js/issues/1187#issuecomment-858314458
So yes, I'd consider dropping support for previous spec versions, tbh.
/cc @mlin @ohofmann
Ya, that was the impetus for the comment. However, my workaround for the dnanexus server (adding a random referenceName to the class=header request) does not work with the ga4gh server. I kind of figured the hacky behavior of adding the random refname wouldn't be great, but I've got to figure out what to do next
That page is now gone along with the deprecated endpoints. I was about to suggest using the official GA4GH htsget endpoint, but it seems to have been having issues for a couple of weeks now?:
/cc @jb-adams can you tilt that one back up please? /cc @victorskl @andrewpatto