
strange behavior for gene validator

Open jjgao opened this issue 5 years ago • 7 comments

Try PAQR10:

[image]

[image]

jjgao avatar Oct 26 '20 22:10 jjgao

It was related to this: https://github.com/cBioPortal/cbioportal/issues/6835

jjgao avatar Oct 28 '20 21:10 jjgao

There are many genes with duplicate symbols: https://docs.google.com/spreadsheets/d/1faa6pufHkwlFFNhSOIE4X-c8Z-rrB2GJof7L_QviXOs/edit#gid=0. As a temporary fix, we could manually replace some of them while @rmadupuri et al. continue the effort to switch to HGNC.

inodb avatar Oct 28 '20 22:10 inodb

Yichao will review the protein-coding genes that have multiple Entrez gene IDs and fix them in the database: https://docs.google.com/spreadsheets/d/1faa6pufHkwlFFNhSOIE4X-c8Z-rrB2GJof7L_QviXOs/edit#gid=1770544155
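
For reference, a minimal sketch of how candidate genes for this kind of review could be listed, assuming the gene table can be exported as (Entrez ID, symbol) pairs. The function name, the sample rows, and the IDs are all hypothetical and are not taken from the cBioPortal schema or from the spreadsheet.

```python
from collections import defaultdict


def find_duplicate_symbols(genes):
    """Return {symbol: sorted Entrez IDs} for symbols mapped to more than one ID.

    `genes` is an iterable of (entrez_id, symbol) pairs. Symbols are compared
    case-insensitively, which is an assumption about how a validator matches them.
    """
    ids_by_symbol = defaultdict(set)
    for entrez_id, symbol in genes:
        ids_by_symbol[symbol.upper()].add(entrez_id)
    return {
        symbol: sorted(ids)
        for symbol, ids in ids_by_symbol.items()
        if len(ids) > 1
    }


# Hypothetical rows: names and IDs are made up for illustration only.
sample_genes = [(1001, "GENE_A"), (1002, "GENE_A"), (2001, "GENE_B")]
duplicates = find_duplicate_symbols(sample_genes)
print(duplicates)
```

Each symbol in the result still needs manual review to decide which Entrez ID to keep; the sketch only narrows down the list.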

jjgao avatar Nov 03 '20 16:11 jjgao

@jjgao @inodb The results are in the spreadsheet uploaded here: db_duplicate_protein_coding_genes.xlsx. Some Entrez IDs can clearly be removed, but many are still ambiguous. I posted the specific TODOs and questions for each gene in the sheet.

yichaoS avatar Nov 03 '20 18:11 yichaoS

@yichaoS I forgot what we decided. Are we going to manually remove the redundant ones, or are we going to fix this as part of the gene data refresh effort?

jjgao avatar Dec 02 '20 22:12 jjgao

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jun 02 '21 15:06 stale[bot]
