Genome builds have no unified format

Open gromdimon opened this issue 1 year ago • 4 comments

Describe the bug Currently genome builds have different "names"/"labels" in different APIs. For example:

https://reev.cubi.bihealth.org/internal/proxy/mehari/genes/txs?hgncId=HGNC:4806&genomeBuild=GENOME_BUILD_GRCH37&pageSize=1000

https://reev.cubi.bihealth.org/internal/proxy/mehari/seqvars/csq?genome_release=grch37&chromosome=6&position=24302274&reference=T&alternative=C

To Reproduce N/A

Expected behavior There should be one standard for genome builds across the mehari APIs.
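One way such a standard could look on the Rust side is a single type that parses every spelling currently in circulation and always prints one canonical label. The following is only a sketch: `GenomeBuild`, `UnknownGenomeBuild`, and the choice of the lowercase label as canonical are assumptions for illustration, not mehari's actual definitions, and the `Grch38` variant is added by analogy with the `grch37` value seen in the URLs above.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical unified genome build type (not mehari's actual definition).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GenomeBuild {
    Grch37,
    Grch38,
}

/// Error returned when a label matches no known build.
#[derive(Debug, PartialEq, Eq)]
pub struct UnknownGenomeBuild(pub String);

impl fmt::Display for UnknownGenomeBuild {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "unknown genome build: {}", self.0)
    }
}

impl std::error::Error for UnknownGenomeBuild {}

impl FromStr for GenomeBuild {
    type Err = UnknownGenomeBuild;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Normalize case, then drop the protobuf-style enum prefix if present,
        // so "GENOME_BUILD_GRCH37" and "grch37" end up as the same label.
        let lower = s.trim().to_ascii_lowercase();
        let label = lower.strip_prefix("genome_build_").unwrap_or(lower.as_str());
        match label {
            "grch37" => Ok(GenomeBuild::Grch37),
            "grch38" => Ok(GenomeBuild::Grch38),
            _ => Err(UnknownGenomeBuild(s.to_string())),
        }
    }
}

impl fmt::Display for GenomeBuild {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Assumed canonical form: the lowercase label used by the seqvars endpoint.
        f.write_str(match self {
            GenomeBuild::Grch37 => "grch37",
            GenomeBuild::Grch38 => "grch38",
        })
    }
}
```

With this in place, both `"GENOME_BUILD_GRCH37".parse::<GenomeBuild>()` and `"grch37".parse::<GenomeBuild>()` yield the same variant, so handlers could accept the old spellings during a transition while only ever emitting one.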

Additional context N/A

gromdimon avatar Mar 15 '24 13:03 gromdimon

The issue is that the type definitions for the seqvar/structvar endpoints are currently defined in Rust, whereas the tx endpoint is defined directly in protobuf.

All exposed API endpoints should be defined consistently. Putting the query definitions in protobuf might have the nice effect of allowing non-HTTP APIs, but they might be less easy to use in Rust.

xiamaz avatar Mar 29 '24 15:03 xiamaz

Currently, prost is used to interface with the protobuf-defined structures, whereas everything else uses serde. Unfortunately, the two are not compatible.

As a general design, moving everything I/O-facing into protobuf might make sense.

xiamaz avatar Mar 29 '24 15:03 xiamaz

Action plan

  • define all request/response types for seqvar/structvar in protobuf
  • coordinate the change with the tools that depend on a properly functioning mehari server (currently reev and varfish)

xiamaz avatar Mar 29 '24 15:03 xiamaz

With utoipa, the OpenAPI schema, and the corresponding new api/v1/…/ endpoints, the genes/txs endpoint is now api/v1/genes/transcripts and takes genome_build=grch37 instead of genomeBuild=GENOME_BUILD_GRCH37 as its parameter. However, the parameter is still named genome_build, in contrast to the genome_release used by the other endpoints.
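Until the parameter name is unified, a client or proxy could paper over the remaining difference by looking up the build under any of the key spellings mentioned in this thread. This is a minimal sketch using only the standard library; `genome_build_param` is a hypothetical helper, not part of mehari, and it assumes a raw query string without percent-encoding.

```rust
/// Hypothetical helper: returns the value of the genome build parameter from a
/// raw query string, accepting any of the key spellings seen in this thread.
/// No percent-decoding is performed.
pub fn genome_build_param(query: &str) -> Option<&str> {
    // Key spellings in order of preference (the ordering is an assumption).
    const KEYS: [&str; 3] = ["genome_build", "genome_release", "genomeBuild"];
    KEYS.iter().find_map(|wanted| {
        query
            .split('&')
            // Skip fragments that are not `key=value` pairs.
            .filter_map(|pair| pair.split_once('='))
            .find(|(key, _)| key == wanted)
            .map(|(_, value)| value)
    })
}
```

For the query `genome_release=grch37&chromosome=6` this returns `Some("grch37")`, and for a query without any of the three keys it returns `None`. Accepting aliases like this only hides the inconsistency; it does not replace agreeing on one name.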

tedil avatar Jan 03 '25 11:01 tedil