
CDM Source

Open MaximMoinat opened this issue 10 months ago • 4 comments

How to populate the cdm_source table if the data is derived from multiple data feeds.

CDM or THEMIS convention?

CDM

Table or Field level?

Table

Is this a general convention?

Yes

Summary of issues

Populating cdm_source if the data is derived from multiple data feeds.

Summary of answer

If a source database is derived from multiple data feeds, the integration of those disparate sources is expected to be documented in the ETL specifications. The source information on each of the databases can be represented as separate records in the CDM_SOURCE table. Currently, there is no mechanism to link individual records in the CDM tables to their source record in the CDM_SOURCE table.
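As a minimal sketch of the "separate records" approach described above, the snippet below loads one CDM_SOURCE row per data feed into an in-memory SQLite database. The table definition is a simplified subset of the CDM_SOURCE columns, and the feed names, abbreviations, and dates are hypothetical placeholders, not taken from any real ETL.

```python
import sqlite3

# In-memory database for illustration only; a real CDM lives in the
# site's own database platform.
conn = sqlite3.connect(":memory:")

# Simplified subset of CDM_SOURCE columns (assumption: the full table
# has additional fields that are omitted here for brevity).
conn.execute(
    """
    CREATE TABLE cdm_source (
        cdm_source_name         TEXT NOT NULL,
        cdm_source_abbreviation TEXT NOT NULL,
        source_description      TEXT,
        source_release_date     TEXT,
        cdm_release_date        TEXT
    )
    """
)

# One record per data feed. All values are hypothetical examples.
feeds = [
    ("Hypothetical EHR Feed", "EHR", "Example EHR extract",
     "2024-01-15", "2024-03-01"),
    ("Hypothetical Claims Feed", "CLAIMS", "Example claims extract",
     "2023-12-31", "2024-03-01"),
]
conn.executemany("INSERT INTO cdm_source VALUES (?, ?, ?, ?, ?)", feeds)
conn.commit()

# Each feed keeps its own source release date, while the CDM release
# date is shared because the feeds were loaded into the same CDM instance.
rows = conn.execute(
    "SELECT cdm_source_abbreviation, source_release_date "
    "FROM cdm_source ORDER BY cdm_source_abbreviation"
).fetchall()
for abbreviation, release_date in rows:
    print(abbreviation, release_date)
```

Note that nothing in this sketch ties rows in other CDM tables back to a particular CDM_SOURCE record, which matches the limitation stated above.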

Related links

https://www.ohdsi.org/web/wiki/doku.php?id=documentation:cdm:cdm_source

Other comments/notes

NA

MaximMoinat, Apr 19 '24 14:04

@MaximMoinat I would like to discuss this at an upcoming CDM WG meeting. If I am not mistaken, many of our tools expect only one record in the CDM Source table (like ARES Indexer) so I want to make sure we are clear on what the software is doing before we declare this.

clairblacketer, Apr 29 '24 13:04

@clairblacketer and @MaximMoinat

At the University of Colorado, we insert one record per data source into the CDM_Source table. Each of these data sources has a different source release date. And I know other health systems combine records from different sources.

Let's discuss further in the CDM WG and then make sure we give guidance via CDM requirements and Themis conventions.

MelaniePhilofsky, Apr 29 '24 15:04

@MelaniePhilofsky agreed. The convention makes sense, I just want to make sure we are aligned across the community

clairblacketer, Apr 29 '24 15:04

@MelaniePhilofsky Interesting, I did not realise this use case for cdm_source. But it makes sense to capture all sources in cdm_source. And although trivial, it is important to define this explicitly because tooling might misbehave otherwise.

Let's discuss in a meeting. I was set on the convention of having only one record in cdm_source, but maybe the tooling should allow this after all.

MaximMoinat, Apr 29 '24 19:04