Support of negative value entrez_gene_id genes in clickhouse table development

Open sheridancbio opened this issue 8 months ago • 1 comments

Some production databases at MSK contain gene records with negative entrez_gene_id values. The same happens when users set up their database with microRNA support, or when phosphorylated genes are added as gene records: these gene table entries are assigned negative values by the function DaoGene.getNextFakeEntrezId() in the cbioportal-core repo.
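To make the idea of "fake" negative IDs concrete, here is a minimal sketch of one plausible allocation scheme. It is an assumption for illustration only and has not been checked against the actual DaoGene.getNextFakeEntrezId() implementation; the class and method names are hypothetical.

```java
import java.util.Collection;
import java.util.List;

// Hypothetical sketch: hands out the next unused negative ID, one below the
// lowest ID seen so far. Not the real DaoGene logic.
class FakeEntrezIdSketch {
    static long nextFakeEntrezId(Collection<Long> existingIds) {
        long lowest = 0L; // real Entrez gene IDs are positive, so start at 0
        for (long id : existingIds) {
            if (id < lowest) {
                lowest = id;
            }
        }
        return lowest - 1;
    }

    public static void main(String[] args) {
        // Only real (positive) IDs present: the first fake ID is -1.
        System.out.println(nextFakeEntrezId(List.of(672L, 7157L)));
        // Fake IDs -1 and -2 already taken: the next one is -3.
        System.out.println(nextFakeEntrezId(List.of(672L, -1L, -2L)));
    }
}
```

Whatever the exact scheme, the point for the clickhouse work is that such rows carry entrez_gene_id values below zero.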

see:
https://github.com/cBioPortal/cbioportal-core/blob/efcc1d2179e26e289e78a138e6d047c6906e36f4/src/main/java/org/mskcc/cbio/portal/dao/DaoGene.java#L71
https://docs.cbioportal.org/deployment/deploy-without-docker/import-the-seed-database/#download-the-cbioportal-seed-database
https://github.com/cBioPortal/cbioportal-core/blob/main/src/main/resources/micrornas.tsv
https://github.com/cBioPortal/cbioportal-core/blob/main/src/main/java/org/mskcc/cbio/portal/scripts/ImportMicroRNAIDs.java

The current efforts at developing clickhouse functionality have not yet encountered negatively valued entrez_gene_id records, but this possibility should be covered and tested before the completion and deployment of clickhouse-enabled portals.

This affects the clickhouse table construction scripts and possibly also downstream logic in the persistence layer of cBioPortal.
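As an illustration of the kind of failure worth testing for, the sketch below shows what happens to a negative ID if any column type or intermediate conversion treats it as unsigned. Whether the clickhouse scripts or the persistence layer actually do this is an assumption, not something confirmed here; the class and method names are hypothetical.

```java
// Hypothetical check: demonstrates how a negative entrez_gene_id is corrupted
// when its bits are reinterpreted as an unsigned 32-bit value.
class NegativeEntrezIdCheck {
    // Reinterprets a signed 32-bit ID as unsigned, widening to long.
    static long reinterpretAsUnsigned32(int entrezGeneId) {
        return Integer.toUnsignedLong(entrezGeneId);
    }

    // A signed 64-bit value survives a text round trip unchanged.
    static boolean survivesSigned64RoundTrip(long entrezGeneId) {
        return Long.parseLong(Long.toString(entrezGeneId)) == entrezGeneId;
    }

    public static void main(String[] args) {
        System.out.println(reinterpretAsUnsigned32(672));    // 672
        System.out.println(reinterpretAsUnsigned32(-1));     // 4294967295
        System.out.println(survivesSigned64RoundTrip(-1L));  // true
    }
}
```

A test fixture with at least one negative entrez_gene_id row, carried through table construction and a query in the persistence layer, would catch this class of problem.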

This arises from this line: https://github.com/cBioPortal/cbioportal/blob/79d36e73f1aeff6d0ab4697e77aa210752772ad6/src/main/resources/db-scripts/clickhouse/clickhouse.sql#L49

sheridancbio · Jun 27 '24 21:06