Study Comparison Support

Open Luke-Sikina opened this issue 3 years ago • 15 comments

Background:

The cBioPortal is an open-access, open-source resource for interactive exploration of multidimensional cancer genomics data sets, which are collected from a multitude of sources such as published research papers, publicly available data repositories, and private data sets. Please refer to the cBioPortal home page for an overview.

The public instance of the cBioPortal hosts hundreds of curated cancer genomics data sets sourced from public data repositories and published research articles. Many of these studies share data sources, sequencing platforms, gene panels, etc., and being able to compare two studies would be a powerful tool for users looking to understand how studies differ from one another.

Furthermore, a Cancer Study Comparison tool would be useful in other ways as well, such as for data curation and for comparing mutation and annotation tools used on the same set of data, among many other potential uses.


Goal:

Create an in-app tool that allows end users to compare two studies.

The tool should show differences in:

  • Samples / Patients
  • Gene panels
  • Molecular Profiles

Approach:

There should be a backend API that accepts a list of study IDs and returns a structured diff of the requested studies. The backend should be integrated into the existing cBioPortal codebase, and should have the route /api/study_comparison?study_ids=study_a,study_b.
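
As a rough illustration of what a "structured diff" could mean for one category (for example sample IDs or gene panel IDs), here is a minimal sketch that splits two sets of identifiers into "only in A", "only in B", and "shared". The class and field names (`StudyDiffSketch`, `IdDiff`) are hypothetical and not part of the cBioPortal codebase; the real response shape is left to the proposal.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: a set-based diff of identifiers taken from two studies. */
class StudyDiffSketch {

    /** Identifiers found only in study A, only in study B, or in both. */
    static final class IdDiff {
        final Set<String> onlyInA;
        final Set<String> onlyInB;
        final Set<String> shared;

        IdDiff(Set<String> onlyInA, Set<String> onlyInB, Set<String> shared) {
            this.onlyInA = onlyInA;
            this.onlyInB = onlyInB;
            this.shared = shared;
        }
    }

    /** Computes the three-way split; TreeSet keeps the output sorted and stable. */
    static IdDiff diff(Set<String> idsA, Set<String> idsB) {
        Set<String> onlyInA = new TreeSet<>(idsA);
        onlyInA.removeAll(idsB);

        Set<String> onlyInB = new TreeSet<>(idsB);
        onlyInB.removeAll(idsA);

        Set<String> shared = new TreeSet<>(idsA);
        shared.retainAll(idsB);

        return new IdDiff(onlyInA, onlyInB, shared);
    }

    public static void main(String[] args) {
        // Made-up panel IDs, used only to exercise the method.
        IdDiff panels = diff(Set.of("PANEL_X", "PANEL_Y"), Set.of("PANEL_Y", "PANEL_Z"));
        System.out.println("only in A: " + panels.onlyInA);
        System.out.println("only in B: " + panels.onlyInB);
        System.out.println("shared:    " + panels.shared);
    }
}
```

Running the same split once per category (samples, patients, gene panels, molecular profiles) would keep the different data types separate in the response.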

There should be a frontend that consumes that API and presents the resulting diff. The presentation of the information is up to you. You should try to find a way to categorize the various types of information, so that diffs of different data types don't blend together. Within each data type, you should reference how established diffing tools display their output when designing your UI.

Resources:

Here are some API endpoints that provide information for studies. You shouldn't use these directly, as we want the comparison done on the backend, but you might want to use their underlying service methods when making a comparison endpoint. Running these curls might also give you a better idea of what these different data objects look like.

  • Samples: curl -X GET "https://www.cbioportal.org/api/studies/acc_tcga/samples?direction=ASC&pageNumber=0&pageSize=10000000&projection=SUMMARY" -H "accept: application/json"
  • Patients: curl -X GET "https://www.cbioportal.org/api/studies/acc_tcga/patients?direction=ASC&pageNumber=0&pageSize=10000000&projection=SUMMARY" -H "accept: application/json"
  • Gene Panels: curl -X POST "https://www.cbioportal.org/api/gene-panel-data/fetch" -H "accept: application/json" -H "Content-Type: application/json" -d "{ \"molecularProfileIds\": [ \"acc_tcga_rppa\", \"acc_tcga_rna_seq_v2_mrna_median_Zscores\", \"acc_tcga_linear_CNA\" ]}"
  • Molecular Profiles: curl -X GET "https://www.cbioportal.org/api/studies/acc_tcga/molecular-profiles?direction=ASC&pageNumber=0&pageSize=10000000&projection=SUMMARY" -H "accept: application/json"

Codebase:

When building a REST endpoint in cBioPortal, you need to add a controller method either to an existing class or to a new controller class. You can look at some of our existing controllers here: https://github.com/cBioPortal/cbioportal/tree/master/web/src/main/java/org/cbioportal/web

In general, controllers call service methods. Service methods retrieve data from repository methods and process that data, returning the result to the controller. This means that in addition to making a new controller method, you should plan on adding a new service as well. You can find examples of service classes here: https://github.com/cBioPortal/cbioportal/tree/master/service/src/main/java/org/cbioportal/service/impl
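
To make the controller → service → repository layering concrete, here is a minimal plain-Java sketch. All type and method names (`StudyComparisonController`, `StudyComparisonService`, `SampleRepository`, `sampleIdsForStudy`) are hypothetical, the repository is an in-memory stand-in, and the web framework annotations a real controller would carry are left out so that the example runs on its own. The "shared sample IDs" result is only a placeholder for real comparison logic.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

/** Hypothetical layering sketch; none of these types exist in cBioPortal. */
class StudyComparisonLayers {

    /** Repository layer: fetches raw data (here a stand-in for database access). */
    interface SampleRepository {
        Set<String> sampleIdsForStudy(String studyId);
    }

    /** Service layer: retrieves data from the repository and processes it. */
    static class StudyComparisonService {
        private final SampleRepository repository;

        StudyComparisonService(SampleRepository repository) {
            this.repository = repository;
        }

        /** Placeholder comparison: sample IDs present in both studies. */
        Set<String> sharedSampleIds(String studyA, String studyB) {
            Set<String> shared = new TreeSet<>(repository.sampleIdsForStudy(studyA));
            shared.retainAll(repository.sampleIdsForStudy(studyB));
            return shared;
        }
    }

    /** Controller layer: parses request input and delegates to the service. */
    static class StudyComparisonController {
        private final StudyComparisonService service;

        StudyComparisonController(StudyComparisonService service) {
            this.service = service;
        }

        /** Handles the value of a query parameter such as study_ids=study_a,study_b. */
        Set<String> compare(String studyIdsParam) {
            List<String> studyIds = Arrays.stream(studyIdsParam.split(","))
                    .map(String::trim)
                    .filter(id -> !id.isEmpty())
                    .collect(Collectors.toList());
            if (studyIds.size() != 2) {
                throw new IllegalArgumentException("Expected exactly two study IDs");
            }
            return service.sharedSampleIds(studyIds.get(0), studyIds.get(1));
        }
    }

    public static void main(String[] args) {
        // In-memory repository with made-up sample IDs.
        SampleRepository repo = studyId ->
                studyId.equals("study_a") ? Set.of("S1", "S2") : Set.of("S2", "S3");
        StudyComparisonController controller =
                new StudyComparisonController(new StudyComparisonService(repo));
        System.out.println(controller.compare("study_a,study_b"));
    }
}
```

Keeping the parsing and validation in the controller and the comparison in the service mirrors the split described above, and lets the service be tested without any HTTP layer.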


Needed skills:

  • Java, JavaScript, SQL
  • General good programming skills and willingness to learn.

Possible mentors:

@Luke-Sikina

Luke-Sikina avatar Feb 22 '22 16:02 Luke-Sikina

@Luke-Sikina What do you mean by data source? Institution? Sequencing platform? Sample source? It's a bit unclear.

ao508 avatar Feb 25 '22 14:02 ao508

Hi! I'm Devansh, a student and web developer intern. I have some experience in React.js and Java. I would like to work on this task. Is this task still available?

devanshcache avatar Mar 07 '22 23:03 devanshcache

Hey! I am Abhi Jain, a frontend web developer. I can code in JavaScript and React.js. I would like to work on this issue. Can you explain it further?

abhijain2003 avatar Mar 11 '22 03:03 abhijain2003

Hi @git-devansh! Thank you for reaching out :) I will make sure @Luke-Sikina follows up with you soon about applying.

ao508 avatar Mar 11 '22 15:03 ao508

Hey! I am Abhi Jain, a frontend web developer. I can code in JavaScript and React.js. I would like to work on this issue. Can you explain it further?

Hi @abhijain2003! Thank you for reaching out :) I will make sure @Luke-Sikina follows up with you soon about applying.

ao508 avatar Mar 11 '22 15:03 ao508

Thanks for this

abhijain2003 avatar Mar 12 '22 03:03 abhijain2003

Hi @git-devansh! Thank you for reaching out :) I will make sure @Luke-Sikina follows up with you soon about applying.

Thank you! Looking forward to it.

devanshcache avatar Mar 12 '22 21:03 devanshcache

I am getting this message twice, but @Luke-Sikina still hasn't reached out to me. Please make arrangements for my guidance. I need mentorship.

Thanks for reading.

abhijain2003 avatar Mar 13 '22 04:03 abhijain2003

@git-devansh @abhijain2003 Yes, this issue is open and we are looking for applicants. If you have any questions, you can ask them here. I'm looking forward to reading your proposals!

Luke-Sikina avatar Mar 14 '22 14:03 Luke-Sikina

Hi, I am Omar.
I am interested in applying to this project. Could I ask a question, please? Are the Samples / Patients, Gene panels and Molecular Profiles stored in a relational database? Thanks.

OmarAshraf1 avatar Mar 14 '22 18:03 OmarAshraf1

Hi, I am Omar. I am interested in applying to this project. Could I ask a question, please? Are the Samples / Patients, Gene panels and Molecular Profiles stored in a relational database? Thanks.

Hi Omar,

Great question. Yes, all data is stored in a MySQL 5.7 database. You can find the schema here: https://github.com/cBioPortal/cbioportal/blob/master/db-scripts/src/main/resources/cgds.sql

Within the schema, these are the tables for the respective objects:

  • Patients: patient
  • Samples: sample
  • Gene Panels: gene_panel
  • Molecular Profiles: genetic_profile

@OmarAshraf1

Luke-Sikina avatar Mar 14 '22 18:03 Luke-Sikina

@Luke-Sikina Thanks a lot, Luke. That is great, I understand the project better now. I am looking forward to applying and I hope to be a part of this. Thanks.

OmarAshraf1 avatar Mar 14 '22 18:03 OmarAshraf1

Is this issue still open? I want to contribute and wanted to know whether this issue is open for GSoC 2023.

MayaSatishRao avatar Jan 21 '23 18:01 MayaSatishRao

@Luke-Sikina I'm a second-year bachelor's student in CSE and I would love to contribute to this issue. Please let me know if it's still up for contribution.

harsh2929 avatar Mar 12 '23 10:03 harsh2929

Hi @Luke-Sikina, I am just touching base to see if there is still interest in this enhancement. I am interested in studying this issue and making a proposal based on it. But first I'd like to see if there is still movement around here.

Thank you

RINO-GAELICO avatar Feb 29 '24 20:02 RINO-GAELICO