cbioportal
Upgrade to new CIViC API
We can do this work as part of the ICTR set-aside funding. Integrate properly in Genome Nexus.
TODO: need to make separate epic detailing the set aside work
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Hey guys. Just pinging this issue. Would love to see cBioPortal pulling in the most current CIViC data which is only available through V2 API. Let us know if we can help!
Hi, we wanted to ping this issue again to make you aware that the CIViC V1 API will officially be retired on November 1st, 2023. Please let me or @acoffman know if you need any assistance in porting cBioPortal to the V2 API.
@susannasiebert @acoffman We use two endpoints in the V1 API. Could you tell me which new queries we should use to get the corresponding data?
- https://civicdb.org/api/genes/ + entrez_gene_ids to get id, name, variants
- https://civicdb.org/api/variants/ + id to get description and evidence_items
Thank you very much!
Hi!
So, the new API is GraphQL based. If you are unfamiliar, you can kind of think of it as similar to SQL. Rather than a series of URL based endpoints (/genes, /variants, etc) there is a singular endpoint: https://civicdb.org/api/graphql. You construct a query, asking for the data you want, and POST it to that endpoint.
As such, there are not direct 1:1 mappings from the old endpoints to the new ones, however you can achieve similar results. We have a sandbox where you can try queries and browse all the available fields: https://civicdb.org/api/graphiql
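For readers who have not used GraphQL over HTTP before, here is a minimal Python sketch of the "construct a query and POST it" step described above. It assumes the endpoint accepts the conventional JSON body with `query` and `variables` keys; the helper names `build_graphql_request` and `run_query` are made up for illustration and are not part of any CIViC client.

```python
import json
import urllib.request

# Endpoint named earlier in this thread.
CIVIC_GRAPHQL_URL = "https://civicdb.org/api/graphql"


def build_graphql_request(query, variables=None, url=CIVIC_GRAPHQL_URL):
    """Build (but do not send) a POST request carrying a GraphQL query.

    Assumption: the server accepts a JSON body of the form
    {"query": ..., "variables": ...}, the usual GraphQL-over-HTTP shape.
    """
    payload = {"query": query, "variables": variables or {}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def run_query(query, variables=None):
    """Send the request and return the decoded JSON body (needs network access)."""
    with urllib.request.urlopen(build_graphql_request(query, variables)) as resp:
        return json.load(resp)
```

Calling `run_query("{ variant(id: 6) { id name } }")` should return a dict whose `data` key mirrors the fields requested, provided the assumption about the request body holds.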
As part of CIViC V2 we have updated our data model to support attaching Evidence Items to logical combinations of variants (for example, BRAF Amplification AND (BRAF V600E OR BRAF V600K)). We call these Molecular Profiles.
All Evidence is attached to a Molecular Profile now rather than directly to a Variant. However, every Variant has a Molecular Profile consisting of only itself which you can treat in essentially the same way you were treating Variants before.
As an example of fetching Evidence Items for a given Variant ID, you could do something like this:
{
  variant(id: 6) {
    id
    name
    alleleRegistryId
    singleVariantMolecularProfile {
      description
      evidenceItems {
        totalCount
        pageInfo {
          endCursor
          hasNextPage
        }
        nodes {
          id
          link
          evidenceType
          evidenceRating
          evidenceDirection
          evidenceLevel
          description
          variantOrigin
          status
          therapies {
            id
            ncitId
            name
          }
          source {
            id
            link
            name
          }
          disease {
            id
            doid
            displayName
          }
        }
      }
    }
  }
}
If you paste that into the sandbox linked above you can see how the response corresponds to the fields you requested, and by browsing the docs linked in the upper right you can see all the fields available.
Currently, there is a way to page through all the genes, or retrieve a gene based on Entrez ID, but I don't believe we have a filter that will take multiple Entrez IDs at once. If that's something you need for your integration we can add one quickly, just let us know!
We have examples of using the API in both Python and R here: https://github.com/griffithlab/civic-v2/tree/main/examples. The Python example includes pagination as well. Also happy to help troubleshoot or answer any additional questions.
Thanks! Adam
@acoffman Thank you so much!
No problem! If you find that you need any additional fields or any other ways to filter the queries, just let us know!
@acoffman Is there a way to send a list of hugo symbols in one query, and get gene and variants information back in response? This is the query I use, but the problem is it only sends one gene at a time so we send too many requests to the server:
query gene($entrezSymbol: String) {
  gene(entrezSymbol: $entrezSymbol) {
    id
    entrezId
    description
    link
    name
    variants {
      nodes {
        name
        id
        link
        singleVariantMolecularProfile {
          description
          evidenceItems {
            nodes {
              id
              name
              description
              evidenceType
              evidenceDirection
              evidenceLevel
              significance
              disease {
                displayName
                name
                id
                link
              }
              therapies {
                name
                id
                ncitId
                therapyAliases
              }
            }
          }
        }
      }
    }
  }
}
Hi @leexgh
We have just pushed a release that includes top level entrezSymbols and entrezIds filters for the genes query.
You can now do something like:
genes(entrezSymbols: ["BRAF", "EGFR"])
to retrieve multiple genes at once. You can request all the same fields as before on the returned Genes.
Keep in mind that if you request more Genes than the default page size (I believe it's 25 in a single request), they could spill over onto multiple pages. You can check that in the pageInfo block:
pageInfo {
  hasNextPage
  endCursor
}
If this is the case, you can set up a pretty straightforward while loop that does something along the lines of: while hasNextPage == true, send the same request as before but pass the value of endCursor to the after filter.
genes(entrezSymbols: ["BRAF", "EGFR"], after: "endCursorValue")
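The loop described above could look roughly like the following Python sketch. It is only an illustration: `send` is a hypothetical callable that POSTs a query plus variables to the GraphQL endpoint and returns the decoded JSON response, and the response is assumed to mirror the query shape under a top-level `data` key.

```python
GENES_QUERY = """
query genes($after: String, $entrezSymbols: [String!]) {
  genes(after: $after, entrezSymbols: $entrezSymbols) {
    pageInfo { hasNextPage endCursor }
    nodes { id name }
  }
}
"""


def fetch_all_genes(send, entrez_symbols):
    """Collect every gene node, following endCursor until hasNextPage is false.

    `send(query, variables)` is a hypothetical callable (not a CIViC API)
    that performs the HTTP POST and returns the response as a dict.
    """
    nodes, after = [], None
    while True:
        variables = {"entrezSymbols": entrez_symbols, "after": after}
        page = send(GENES_QUERY, variables)["data"]["genes"]
        nodes.extend(page["nodes"])
        if not page["pageInfo"]["hasNextPage"]:
            return nodes
        # Resume right after the last item of the page just received.
        after = page["pageInfo"]["endCursor"]
```

Passing the transport in as `send` keeps the paging logic separate from the HTTP client, so it can be exercised with canned responses before pointing it at the real server.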
Let us know if this helps or if there's anything additional we can do to make your integration easier!
Thanks, Adam
@acoffman Hi Adam, thank you for the updates! It's very helpful! I have a follow-up question about graphql query pagination. Here is my query structure:
query genes($after: String, $entrezSymbols: [String!]) {
  genes(after: $after, entrezSymbols: $entrezSymbols) {
    pageInfo {
      endCursor
      hasNextPage
      startCursor
      hasPreviousPage
    }
    nodes {
      # some fields
      variants {
        pageInfo {
          endCursor
          hasNextPage
          startCursor
          hasPreviousPage
        }
        nodes {
          # some fields
          singleVariantMolecularProfile {
            # some fields
            evidenceItems {
              pageInfo {
                endCursor
                hasNextPage
                startCursor
                hasPreviousPage
              }
              nodes {
                # some fields
                disease {
                  # some fields
                }
                therapies {
                  # some fields
                }
              }
            }
          }
        }
      }
    }
  }
}
As you can see, we will give a list of gene symbols. The information we need is:
- annotation for each gene
- for each single gene, we need all variants of this gene
- for each single variant, we need all evidence of this variant
So there will be three levels of pagination in the response (for gene, variant, and evidence respectively). As far as I can see, the page size is set to 50 (or 25? I tested on the civicdb playground and got 50 back) and cannot be overridden. This means we may need to send up to n^3 follow-up queries (in the theoretical worst case, 1 gene needs n variants + n*n evidence) to get all the data we need.
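To make the request-count concern concrete, here is a rough Python estimate of how many requests a nested fetch might need. It rests on an assumption that is not confirmed anywhere in this thread: every nested connection (variants of a gene, evidence of a variant) that overflows one page has to be paged with its own follow-up requests. The input shape and function names are hypothetical.

```python
import math


def extra_pages(total, page_size):
    """Follow-up requests needed for one connection beyond its first page."""
    return max(0, math.ceil(total / page_size) - 1)


def estimate_requests(evidence_counts_by_gene, page_size):
    """Estimate requests for a nested genes -> variants -> evidence fetch.

    `evidence_counts_by_gene` is a hypothetical input: one list per gene,
    holding the evidence-item count of each of that gene's variants.
    Assumes each overflowing nested connection is paged independently.
    """
    genes = len(evidence_counts_by_gene)
    requests = max(1, math.ceil(genes / page_size))  # top-level gene pages
    for variant_counts in evidence_counts_by_gene:
        # Variants of this gene that did not fit on the first page.
        requests += extra_pages(len(variant_counts), page_size)
        # Evidence of each variant that did not fit on the first page.
        requests += sum(extra_pages(n, page_size) for n in variant_counts)
    return requests
```

Under these assumptions, one gene with 60 variants that each carry 120 evidence items comes to 122 requests at a page size of 50, while a gene whose variants and evidence all fit on one page needs just 1.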
Do you have any suggestions on handling the fetching of nested queries? We'd like to reduce the number of requests sent to civicdb to make sure both of us can have the best performance.
Thank you very much!
Xiang
@leexgh Do you have a specific example of all the data you would like to display and how you're currently displaying it? On the CIViC website we try to avoid multiple nested queries like this and instead pull back the evidence-level data in a separate request that only gets executed when the user wants it, e.g. by utilizing popovers. On pages like the browse tables, where this is unavoidable, we use a materialized view to avoid executing computationally expensive, complex queries with multiple nested levels of joins on the fly. You might want to look into the browseGenes query instead of the genes query. It takes a single entrezSymbol but aggregates the variants as well as the disease and therapy terms for the underlying evidence, so the number of requests would depend on the number of genes. Note that this query includes complex molecular profiles, which might not be desired on your end.
@susannasiebert We display civicdb data in the cBioPortal mutations table and copy number alteration table. For example, when you hover over the CIViC icon, a tooltip pops up:
We need to show the gene (PIK3CA) and its description (text in the first paragraph), the variant (E545K) of this gene and its description (text in the purple box), and the evidence count of the variant by type (predictive: 30, prognostic: 1).
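The per-type tally described above can be sketched client-side in a few lines of Python. This assumes each evidence node carries an `evidenceType` string (the field appears in the queries in this thread, but the exact values such as "PREDICTIVE" are an assumption), and both function names are illustrative only.

```python
from collections import Counter


def count_evidence_by_type(evidence_nodes):
    """Tally evidence items by their evidenceType field."""
    return Counter(node["evidenceType"] for node in evidence_nodes)


def format_counts(counts):
    """Render counts as tooltip text, most frequent type first."""
    return ", ".join(f"{etype.lower()}: {n}" for etype, n in counts.most_common())
```

The catch with this approach is that every evidence item has to be fetched first, which is exactly what drives the nested pagination discussed in this thread.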
We usually have hundreds of genes in the copy number alteration table. Switching to queries that only accept a single gene would need hundreds of queries, which is not ideal for performance.
Do you think it's possible to have a customized page size? I tried the first parameter in the genes query, but it only returns up to 50 records based on my test. It would be helpful if it could accept a larger number.
Any suggestion is appreciated!
@leexgh We can definitely increase the allowable page size up from 50.
Unfortunately, we can't let it be entirely unbounded; because GraphQL lets you define arbitrary queries, if we had no limits on page size, people could write queries that potentially pulled back the entire database at once. While that would be nice, it wouldn't be performant for our servers or users. We will do a little testing on our end and figure out how high we can increase the limit and still maintain acceptable performance. Hopefully we can make it less likely that you'll need to break it up into multiple queries, but you still may need to be aware of that possibility. The hasNextPage boolean will let you know.
If you need to display the counts of various evidence types in the popover, we can make that directly queryable in the API for you so that you don't have to pull the evidence back and aggregate it yourself.
We probably won't have it done before the Thanksgiving break, but we should be able to get these changes out next week and I will follow up here when we do!
@acoffman Thank you so much!
Hi @acoffman! Do you have any updates about the API? I appreciate any information you can provide.
I'm terribly sorry but our release of the new "evidence counts by type" feature has been delayed on our end. It won't be out until next week. In the meantime you can test out this feature on our staging website (staging.civicdb.org). With this update you should be able to do the following:
query genes($after: String, $entrezSymbols: [String!]) {
  genes(after: $after, entrezSymbols: $entrezSymbols) {
    pageInfo {
      endCursor
      hasNextPage
      startCursor
      hasPreviousPage
    }
    nodes {
      # some fields
      variants {
        pageInfo {
          endCursor
          hasNextPage
          startCursor
          hasPreviousPage
        }
        nodes {
          # some fields
          singleVariantMolecularProfile {
            # some fields
            evidenceCountsByType {
              diagnosticCount
              predictiveCount
              prognosticCount
              predisposingCount
              oncogenicCount
              functionalCount
            }
          }
        }
      }
    }
  }
}
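On the consuming side, the pre-aggregated counts could be turned into tooltip data with something like the Python sketch below. The field names are taken from the query above; the assumption is that each `singleVariantMolecularProfile` object in the response is a plain dict holding an `evidenceCountsByType` mapping, and `nonzero_evidence_counts` is a made-up helper name.

```python
# Count fields as listed in the evidenceCountsByType selection above.
COUNT_FIELDS = (
    "diagnosticCount",
    "predictiveCount",
    "prognosticCount",
    "predisposingCount",
    "oncogenicCount",
    "functionalCount",
)


def nonzero_evidence_counts(molecular_profile):
    """Map evidence type -> count, dropping types with no evidence.

    Assumes `molecular_profile` is the decoded singleVariantMolecularProfile
    object from the response; a missing counts block yields an empty dict.
    """
    counts = molecular_profile.get("evidenceCountsByType") or {}
    return {
        field[: -len("Count")]: counts[field]  # "predictiveCount" -> "predictive"
        for field in COUNT_FIELDS
        if counts.get(field)
    }
```

With this shape no evidence items need to be fetched at all, which removes the third level of pagination from the query.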
@susannasiebert No problem at all. I appreciate the update. Looking forward to seeing the new feature next week! This will be very helpful for us, thank you very much!
@susannasiebert Happy new year! Hope you had a great holiday time! Just want to check if there is a plan for the new release?
Hi @leexgh,
Thanks for following up with us! A new release is out as of Friday that contains the new query documented above. We will push out an additional release this week that also increases the maximum allowable page size. I'll follow up here when that is deployed as well!
@acoffman Thank you so much! Looking forward to the new release!
Hi @leexgh
This release is out! We have doubled the maximum page size to 100 entries and the fields that Susanna demonstrated here are available for querying.
Thanks! Adam
Hi @acoffman, thank you very much!
@acoffman I found an issue with the genes query: https://github.com/griffithlab/civic-v2/issues/980. Please let me know if you want me to add more explanation.
Thank you so much @leexgh for the detailed bug report; that made the issue easy to track down.
I have a hotfix going out this afternoon which will resolve it!
@acoffman Thanks for the quick fix!