Support negative entrez_gene_id values in ClickHouse table development
Some production databases at MSK contain gene records with negative entrez_gene_id values. The same happens in any database that is set up with microRNA support or that has phosphorylated genes added as gene records. In these cases, negative values are assigned to the gene table entries by the function DaoGene.getNextFakeEntrezId() in the cbioportal-core repo.
see:
https://github.com/cBioPortal/cbioportal-core/blob/efcc1d2179e26e289e78a138e6d047c6906e36f4/src/main/java/org/mskcc/cbio/portal/dao/DaoGene.java#L71
https://docs.cbioportal.org/deployment/deploy-without-docker/import-the-seed-database/#download-the-cbioportal-seed-database
https://github.com/cBioPortal/cbioportal-core/blob/main/src/main/resources/micrornas.tsv
https://github.com/cBioPortal/cbioportal-core/blob/main/src/main/java/org/mskcc/cbio/portal/scripts/ImportMicroRNAIDs.java
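To make the scenario concrete, here is a minimal sketch of how negative placeholder IDs could end up alongside real Entrez IDs. The function `next_fake_entrez_id` and its "one below the lowest negative ID in use" rule are assumptions for illustration only. They are not copied from DaoGene.getNextFakeEntrezId(), whose actual logic should be read at the link above.

```python
def next_fake_entrez_id(existing_ids):
    """Return a negative ID one below the lowest negative ID already in use.

    Hypothetical allocation rule for illustration; not the DaoGene implementation.
    """
    lowest_negative = min((i for i in existing_ids if i < 0), default=0)
    return lowest_negative - 1


# Real Entrez IDs (BRCA1 = 672, TP53 = 7157) plus two placeholder records,
# e.g. for microRNAs or phosphorylated genes that have no Entrez ID.
gene_ids = [672, 7157]
gene_ids.append(next_fake_entrez_id(gene_ids))
gene_ids.append(next_fake_entrez_id(gene_ids))
print(gene_ids)
```

Whatever the exact allocation rule is, the result is a gene table in which positive and negative entrez_gene_id values coexist, and that is the case the ClickHouse tables need to handle.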
The current efforts at developing ClickHouse functionality have not yet encountered negative entrez_gene_id records, but this case should be covered and tested before ClickHouse-enabled portals are completed and deployed.
This affects the ClickHouse table construction scripts and possibly also downstream logic in the persistence layer of cBioPortal.
This arises from this line: https://github.com/cBioPortal/cbioportal/blob/79d36e73f1aeff6d0ab4697e77aa210752772ad6/src/main/resources/db-scripts/clickhouse/clickhouse.sql#L49
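As an illustration of the kind of failure worth testing for, the sketch below shows what happens to a negative ID if its bits are reinterpreted as an unsigned 32-bit integer, along with a small helper that reports which IDs do not survive a load step unchanged. This is plain two's-complement arithmetic. It is an assumption about one possible failure mode, not a statement about how the linked script or ClickHouse actually treats the column. The names `as_unsigned_32` and `ids_changed_by_load` are hypothetical.

```python
import struct


def as_unsigned_32(value: int) -> int:
    """Reinterpret the bits of a signed 32-bit integer as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<i", value))[0]


def ids_changed_by_load(source_ids, load):
    """Return the source IDs whose value differs after passing through `load`."""
    return [i for i in source_ids if load(i) != i]


# Two real Entrez IDs and two negative placeholder IDs.
source_ids = [672, 7157, -1, -2]

# Stand-in for a load step that stores IDs in an unsigned column (assumption).
broken = ids_changed_by_load(source_ids, as_unsigned_32)
print(broken)
```

A regression test along these lines, run against a database that contains microRNA or phosphoprotein gene records, would show whether negative entrez_gene_id values round-trip through the derived tables intact.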