CommonDataModel
CDM Source
How to populate the cdm_source table if the data is derived from multiple data feeds.
CDM or THEMIS convention?
CDM
Table or Field level?
Table
Is this a general convention?
Yes
Summary of issues
Populating cdm_source if the data is derived from multiple data feeds.
Summary of answer
If a source database is derived from multiple data feeds, the integration of those disparate sources is expected to be documented in the ETL specifications. The source information for each feed can be represented as a separate record in the CDM_SOURCE table. Currently, there is no mechanism to link individual records in the CDM tables to their source record in the CDM_SOURCE table.
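To illustrate, here is a minimal sketch of one CDM_SOURCE record per data feed, using Python's built-in sqlite3 module and an in-memory database. The table definition is a simplified subset of the CDM_SOURCE columns. The feed names, holder, dates, and the helper functions `build_cdm_source` and `source_abbreviations` are hypothetical and only there for illustration.

```python
import sqlite3

# Simplified subset of CDM_SOURCE columns (assumption: a real CDM has more fields).
DDL = """
CREATE TABLE cdm_source (
    cdm_source_name         TEXT NOT NULL,
    cdm_source_abbreviation TEXT NOT NULL,
    cdm_holder              TEXT NOT NULL,
    source_release_date     TEXT NOT NULL,
    cdm_release_date        TEXT NOT NULL,
    cdm_version             TEXT
)
"""

# Hypothetical feeds: each has its own source release date,
# while the CDM release date is shared by the combined database.
FEEDS = [
    ("Example EHR feed", "EHR", "Example Health System",
     "2023-01-15", "2023-03-01", "5.4"),
    ("Example claims feed", "CLAIMS", "Example Health System",
     "2022-12-31", "2023-03-01", "5.4"),
]


def build_cdm_source(feeds):
    """Create an in-memory cdm_source table with one row per feed."""
    conn = sqlite3.connect(":memory:")
    conn.execute(DDL)
    conn.executemany(
        "INSERT INTO cdm_source VALUES (?, ?, ?, ?, ?, ?)", feeds
    )
    conn.commit()
    return conn


def source_abbreviations(conn):
    """Return the abbreviations of all recorded sources, sorted."""
    rows = conn.execute(
        "SELECT cdm_source_abbreviation FROM cdm_source "
        "ORDER BY cdm_source_abbreviation"
    ).fetchall()
    return [row[0] for row in rows]


conn = build_cdm_source(FEEDS)
print(source_abbreviations(conn))  # ['CLAIMS', 'EHR']
```

Note that nothing in this sketch ties a row in any other CDM table back to one of these records, which matches the limitation stated above. Tooling that reads CDM_SOURCE would need to handle more than one row.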
Related links
https://www.ohdsi.org/web/wiki/doku.php?id=documentation:cdm:cdm_source
Other comments/notes
NA
@MaximMoinat I would like to discuss this at an upcoming CDM WG meeting. If I am not mistaken, many of our tools expect only one record in the CDM Source table (like ARES Indexer) so I want to make sure we are clear on what the software is doing before we declare this.
@clairblacketer and @MaximMoinat
At the University of Colorado, we insert one record per data source into the CDM_Source table. Each of these data sources has a different source release date. And I know other health systems combine records from different sources.
Let's discuss further in the CDM WG and then make sure we give guidance via CDM requirements and Themis conventions.
@MelaniePhilofsky agreed. The convention makes sense, I just want to make sure we are aligned across the community
@MelaniePhilofsky Interesting, I did not realise this use case for cdm_source. But it makes sense to capture all sources in cdm_source. And although it may seem trivial, it is important to define this explicitly because tooling might misbehave otherwise.
Let's discuss in a meeting. I was set on the convention of having only one record in cdm_source, but maybe the tooling should allow this after all.