DataQualityDashboard
DataQualityDashboard copied to clipboard
Test for Maps to Value relationships
Following query will test Observations. However, the query needs to know the vocabulary of the observation source code. Otherwise I see error in my CDM because ICD10 has some Maps to Value relationships that are not in ICD10CM.
SELECT count(*) AS denominator
, SUM(case WHEN value_as_concept_id = 0 THEN 1 ELSE 0 END) AS num_violated_rows
FROM
(
SELECT observation_source_value, COALESCE(value_as_concept_id, 0) AS value_as_concept_id
, c1.vocabulary_id AS source_vocabulary, c1.concept_name AS source_concept_name
, cMap.concept_name AS mapTo_concept_name, cValue.concept_name AS value_concept_name
from combined_cdm4_18_22_mar_amg_ach.observation
JOIN concept c1 ON c1.concept_code = observation_source_value
JOIN concept_relationship Rmap ON Rmap.relationship_id = 'Maps to' AND Rmap.concept_id_1 = c1.concept_id
JOIN concept CMap ON Cmap.concept_id = Rmap.concept_id_2
JOIN concept_relationship Rvalue ON rValue.relationship_id = 'Maps to value' AND rValue.concept_id_1 =c1.concept_id
JOIN concept cValue on cValue.concept_id = rValue.concept_id_2
WHERE c1.vocabulary_id IN('ICD10CM')
)
@don-torok Could you give a specific example (i.e. an ICD10 code and its value) where you encounter this issue?
-edit- And am I correct in assuming that this checks whether when concept has a 'Maps to Value' relation, then a value_concept_id should be given?
I think this query will show source codes that have the problem:
SELECT distinct concept_code
FROM
( /* All source code that map have a Maps to value relationship */
SELECT c1.concept_code , c1.vocabulary_id, cMap.concept_id AS target_concept , COALESCE(cValue.concept_id, 0) AS value_as_concept_id
FROM concept c1
JOIN concept_relationship Rmap ON Rmap.relationship_id = 'Maps to' AND Rmap.concept_id_1 = c1.concept_id
JOIN concept CMap ON Cmap.concept_id = Rmap.concept_id_2 AND cMap.domain_id IN('Observation', 'Measurement')
JOIN concept_relationship Rvalue ON rValue.relationship_id = 'Maps to value' AND rValue.concept_id_1 =c1.concept_id
JOIN concept cValue on cValue.concept_id = rValue.concept_id_2
) to_value
JOIN
( /* All source codes that do not have a Maps to value rellationship */
SELECT *
FROM
(
SELECT c1.concept_code, c1.vocabulary_id , cMap.concept_id AS target_concept , COALESCE(cValue.concept_id, 0) AS value_as_concept_id
FROM concept c1
JOIN concept_relationship Rmap ON Rmap.relationship_id = 'Maps to' AND Rmap.concept_id_1 = c1.concept_id
JOIN concept CMap ON Cmap.concept_id = Rmap.concept_id_2 AND cMap.domain_id IN('Observation', 'Measurement')
LEFT OUTER JOIN concept_relationship Rvalue ON rValue.relationship_id = 'Maps to value' AND rValue.concept_id_1 =c1.concept_id
LEFT OUTER JOIN concept cValue on cValue.concept_id = rValue.concept_id_2
) WHERE value_as_concept_id = 0
) no_value USING( concept_code )
This will flush out examples
SELECT DISTINCT c1.concept_code, c1.vocabulary_id , c1.concept_name, cMap.concept_id AS target_concept, cMap.concept_name , COALESCE(cValue.concept_id, 0) AS value_as_concept_id, cValue.concept_name
FROM concept c1
JOIN concept_relationship Rmap ON Rmap.relationship_id = 'Maps to' AND Rmap.concept_id_1 = c1.concept_id
JOIN concept CMap ON Cmap.concept_id = Rmap.concept_id_2
LEFT OUTER JOIN concept_relationship Rvalue ON rValue.relationship_id = 'Maps to value' AND rValue.concept_id_1 =c1.concept_id
LEFT OUTER JOIN concept cValue on cValue.concept_id = rValue.concept_id_2
WHERE c1.concept_code IN( 'Y63' , 'V16.5', 'O32.3', 'V28.9', 'V28.9' )
ORDER BY 1
A refinement to the query in my initial issue is to add the Observation_Concept_id to the join criteria so that it would be
JOIN concept CMap ON Cmap.concept_id = Rmap.concept_id_2 AND Cmap.concept_id = Observation_Concept_id
However I think there are a few source codes, such as 'Y63' that would still cause a problem.
related to #148