
Error in MIMIC-IV-v2.0 dod record

Open Nicky-Jin opened this issue 1 year ago • 2 comments

Prerequisites

  • [X] Put an X between the brackets on this line if you have done all of the following:
    • Checked the online documentation: https://mimic.mit.edu/
    • Checked that your issue isn't already addressed: https://github.com/MIT-LCP/mimic-code/issues?utf8=%E2%9C%93&q=

Description

Dear Sir or Madam,

I recently noticed an error in the dod (date of death) records of MIMIC-IV v2.0: some dod values fall before the patient's ICU admission. For example, subject 14150625 died on 2137-01-24 but has an ICU admission at 2143-12-20 12:34:58.

I computed the survival time with:

SELECT
icu.subject_id
, icu.stay_id
, icu.intime
, icu.outtime
, pa.dod
, EXTRACT(EPOCH FROM pa.dod - icu.intime)/(60*60*24) AS survive_time
FROM mimiciv_icu.icustays icu
LEFT JOIN mimiciv_hosp.patients pa 
ON pa.subject_id = icu.subject_id
ORDER BY survive_time

I found that a number of patients have this problem. [Image 1]

I hope this bug can be fixed soon.

Yours, Nicky

Nicky-Jin, Jul 22 '22 07:07

The same problem still exists in MIMIC-IV v2.2.

Here is the list of subject_id values that have this problem: [10554954, 11042406, 11660628, 12207593, 12376923, 12393516, 12921133, 13078944, 15018122, 15190414, 15246174, 15831207, 15834858, 16467939, 16533974, 17536748, 17631949, 17906419, 17955142, 18839671, 19107535, 19379644, 19547124, 19752788, 19914761]

Hope it can be fixed soon.

Thanks in advance.

Regards, Qingpeng

KimballCai, Apr 06 '23 12:04

First, since dod is a date, you do have to be careful of apparent "negative" survival, which occurs when you subtract a datetime from a date: the date is treated as midnight at the start of that day. For example, for subject_id 19936081 and stay_id 39984717, the intime was on the 24th and the dod is the 24th, but the survive_time as you calculate it is -0.72. Instead, you can do date arithmetic by converting the intime to a date:

SELECT
icu.subject_id
, icu.stay_id
, icu.intime
, icu.outtime
, pa.dod
, CAST(pa.dod AS DATE) - CAST(icu.intime AS DATE) AS survive_time
FROM mimiciv_icu.icustays icu
LEFT JOIN mimiciv_hosp.patients pa 
ON pa.subject_id = icu.subject_id
ORDER BY survive_time
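
To see why the two queries above disagree for a same-day death, here is a minimal Python sketch of the same arithmetic. The year, month, and time of day are hypothetical (the thread only says that both intime and dod fall on the 24th); the time was chosen so that the first calculation reproduces the -0.72 mentioned above.

```python
from datetime import date, datetime, time

# Hypothetical values: only "the 24th" comes from the thread.
intime = datetime(2150, 1, 24, 17, 16, 48)
dod = date(2150, 1, 24)

# First query: dod is treated as midnight, then the full timestamp is subtracted.
dod_midnight = datetime.combine(dod, time.min)
naive_days = (dod_midnight - intime).total_seconds() / (60 * 60 * 24)

# Second query: both sides are truncated to dates, giving whole days.
date_days = (dod - intime.date()).days

print(round(naive_days, 2))  # -0.72
print(date_days)             # 0
```

With date-only arithmetic, a death on the day of admission yields 0 rather than a negative fraction, so any remaining negative values point to a dod that really does precede the admission date.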

When we do this calculation, we see:

  • 4 ICU stays have a date of death more than 1 year before their ICU admission. These are clearly erroneous, probably due to the probabilistic linkage (which you can read more about in the "Out-of-hospital mortality" section of the paper: https://www.nature.com/articles/s41597-022-01899-x)
  • 3 ICU stays have a date of death 1 day before their ICU admission. These also look erroneous, but probably the raw date of death is just documented incorrectly
  • Not shown in the query above, but there are another ~30 stays where the dod is on the same date as ICU admission but falls a day or two before the outtime
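
The grouping above can be sketched as a small bucketing function over the whole-day gap (dod date minus intime date). This is only an illustration: the function name, labels, and the 365-day threshold are assumptions, and the one example uses the v2.0 dates quoted earlier in this thread for subject 14150625.

```python
from datetime import date, datetime

def classify_gap(days: int) -> str:
    """Bucket (dod date - intime date) in whole days; thresholds are assumptions."""
    if days < -365:
        return "more than 1 year before admission"
    if days < 0:
        return "shortly before admission"
    if days == 0:
        return "same day as admission"
    return "after admission"

# Dates quoted in this thread for subject 14150625 (MIMIC-IV v2.0).
dod = date(2137, 1, 24)
intime = datetime(2143, 12, 20, 12, 34, 58)

gap_days = (dod - intime.date()).days
label = classify_gap(gap_days)
print(gap_days, label)
```

Stays in the first two buckets are the ones worth reviewing by hand; the same-day bucket additionally needs a comparison against outtime, which this sketch does not cover.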

I'd say overall you just have to expect some amount of error when linking 60,000+ records across databases. Since it's a small number, it should be fairly reasonable to investigate their records (either charted or via the note) to figure out exactly what happened, and correct the data in a query. Maybe we could add this column, with the corrections, into the icustay_detail query? https://github.com/MIT-LCP/mimic-code/blob/main/mimic-iv/concepts/demographics/icustay_detail.sql
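
As a rough sketch of what "correcting the data" could look like once the records have been reviewed, a small override table keyed by subject_id can be applied on top of the linked dod. Everything here is hypothetical: the `DOD_CORRECTIONS` name, the placeholder ids, and the dates are for illustration only and are not real MIMIC-IV corrections or part of icustay_detail.

```python
from datetime import date
from typing import Dict, Optional

# Hypothetical manual overrides; None means "treat dod as unknown".
# The ids and dates are placeholders, not real MIMIC-IV values.
DOD_CORRECTIONS: Dict[int, Optional[date]] = {
    1: None,
    2: date(2150, 3, 2),
}

def corrected_dod(subject_id: int, dod: Optional[date]) -> Optional[date]:
    """Return the reviewed dod if an override exists, else the linked dod unchanged."""
    if subject_id in DOD_CORRECTIONS:
        return DOD_CORRECTIONS[subject_id]
    return dod

print(corrected_dod(2, date(2140, 1, 1)))  # override applies
print(corrected_dod(3, date(2151, 6, 9)))  # no override, original kept
```

Keeping the overrides separate from the source table leaves the original linkage intact and makes each manual correction easy to audit.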

alistairewj, Apr 06 '23 13:04