
Clinical Data Counts Issue (Patients are incorrectly counted because of case insensitivity)

Open • haynescd opened this issue 4 months ago • 1 comment

Study where the issue was found: https://www.cbioportal.org/study/summary?id=prad_organoids_msk_2022

Specifically, the issue was found while looking at the Ethnicity pie chart.

Patients related to the issue:

  • VCAP
  • VCaP
  • LNCAP
  • LNCaP
  • 22RV1
  • 22Rv1
  • PC3
  • PC-3

The patients above appear to be merged when computing clinical-data-counts, even though each has a unique internal patient ID.

The Ethnicity chart shows 5 Caucasian patients, but there are 6 in total.
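
To illustrate the suspected mechanism, here is a minimal Python sketch (not the portal's actual counting code) showing how keying patients by a case-folded stable ID collapses distinct patients into one. The helper name `group_by_folded_id` is hypothetical, and the use of `str.casefold()` as a stand-in for whatever case-insensitive comparison happens in the backend is an assumption.

```python
from collections import defaultdict

# Patient stable IDs exactly as listed in this issue.
patient_ids = ["VCAP", "VCaP", "LNCAP", "LNCaP", "22RV1", "22Rv1", "PC3", "PC-3"]


def group_by_folded_id(ids):
    """Group IDs by their case-folded form (hypothetical helper).

    This mimics a case-insensitive key; it is an assumption about the
    bug, not the real cBioPortal implementation.
    """
    groups = defaultdict(list)
    for pid in ids:
        groups[pid.casefold()].append(pid)
    return dict(groups)


groups = group_by_folded_id(patient_ids)

# Keys that more than one distinct patient ID maps to.
collisions = {key: ids for key, ids in groups.items() if len(ids) > 1}

print(len(set(patient_ids)))  # 8 distinct IDs when compared case-sensitively
print(len(groups))            # 5 distinct keys when compared case-insensitively
print(collisions)
```

Note that PC3 and PC-3 differ by a hyphen rather than by case, so a purely case-insensitive comparison would not merge that pair; only the VCAP, LNCAP, and 22RV1 pairs collide in this sketch.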

Found during the RFC80 effort.

SQL used:

SELECT ''                                                   AS sample_unique_id,
       concat(cs.cancer_study_identifier, '_', p.stable_id) AS patient_unique_id,
       p.internal_id,
       cam.attr_id                                          AS attribute_name,
       ifNull(clinpat.attr_value, '')                       AS attribute_value,
       cs.cancer_study_identifier                           AS cancer_study_identifier,
       'patient'                                            AS type
FROM patient AS p
    INNER JOIN cancer_study AS cs ON p.cancer_study_id = cs.cancer_study_id
    FULL OUTER JOIN clinical_attribute_meta AS cam ON cs.cancer_study_id = cam.cancer_study_id
    FULL OUTER JOIN clinical_patient AS clinpat ON (p.internal_id = clinpat.internal_id) AND (clinpat.attr_id = cam.attr_id)
WHERE cam.patient_attribute = 1
  AND cs.cancer_study_identifier = 'prad_organoids_msk_2022'
  AND attribute_name = 'ETHNICITY';

haynescd, Oct 16 '24 16:10