
Deidentification of neurological assessment scores in notes (MIMIC-III, MIMIC-IV)

Open tilmanbeck opened this issue 1 year ago • 0 comments

Prerequisites

  • [X] Put an X between the brackets on this line if you have done all of the following:
    • Checked the online documentation: https://mimic.mit.edu/
    • Checked that your issue isn't already addressed: https://github.com/MIT-LCP/mimic-code/issues?utf8=%E2%9C%93&q=

Description

Hi, in our project we are looking at patients with a subarachnoid hemorrhage (SAH) diagnosis. Such patients often undergo a neurological assessment, which includes grading scores such as Hunt and Hess, WFNS, or the (Modified) Fisher scale. These scores are often gathered during admission and reported in the discharge summary. Extracting them from free-text notes can be useful for downstream applications.

It seems that the names of these scores are masked in the notes. For example, in MIMIC-III, the TEXT field of the entry with HADM_ID=167857 and CATEGORY="Discharge summary" in NOTEEVENTS.csv.gz has the "Hess" part of "Hunt and Hess" masked. Further, the subsequent score name in the same entry is completely masked, making it impossible to recover. In MIMIC-IV, a similar, though slightly different, phenomenon can be observed: in the text field of the entry with note_id=13317644-DS-20 in mimic-iv-note/2.2/note/discharge.csv.gz, both "Hunt" and "Hess" are masked, whereas "Fisher" is not.
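To make the pattern easier to look for across notes, here is a minimal sketch that flags de-identification placeholders appearing close to a score-name fragment. It assumes the MIMIC-III placeholder style `[** ... **]` and the MIMIC-IV style `___`; the cue list, the window size, and the function name `find_masked_score_mentions` are my own choices for illustration, not part of any MIMIC tooling.

```python
import re

# Placeholder styles: MIMIC-III wraps masked text in [** ... **],
# MIMIC-IV replaces it with a run of underscores.
PLACEHOLDER = re.compile(r"\[\*\*.*?\*\*\]|_{3,}")

# Score-name fragments that may survive masking (assumed, not exhaustive).
CUE = re.compile(r"\b(hunt|hess|fisher|wfns)\b", re.IGNORECASE)


def find_masked_score_mentions(text, window=30):
    """Return context snippets where a placeholder sits near a score cue."""
    hits = []
    for match in PLACEHOLDER.finditer(text):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        context = text[start:end]
        if CUE.search(context):
            hits.append(context.strip())
    return hits
```

Applied to the TEXT column of NOTEEVENTS.csv.gz or the text column of discharge.csv.gz, this only surfaces candidates for manual review; it cannot recover a score name that is fully masked with no cue nearby.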

I wonder whether context-specific rules could be added to the deidentification algorithm, similar to what was suggested in #1507?
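To illustrate what I mean by a context-specific rule, here is a rough sketch of a post-filter that drops candidate PHI spans overlapping a known score eponym. I do not know how the actual MIMIC deidentification pipeline is structured, so everything here (the `SCORE_EPONYMS` pattern, the `(start, end)` span format, and the function names) is hypothetical.

```python
import re

# Eponyms that should not be treated as person names in this context.
# "Fisher" is only protected when followed by grade/scale/score.
SCORE_EPONYMS = re.compile(
    r"\bHunt\s*(?:and|&)\s*Hess\b"
    r"|\b(?:modified\s+)?Fisher\b(?=\s+(?:grade|scale|score)\b)",
    re.IGNORECASE,
)


def protected_spans(text):
    """Character spans covered by a score eponym."""
    return [m.span() for m in SCORE_EPONYMS.finditer(text)]


def filter_phi_spans(text, phi_spans):
    """Keep only candidate PHI spans that do not overlap a score eponym."""
    protected = protected_spans(text)

    def overlaps(a, b):
        return a[0] < b[1] and b[0] < a[1]

    return [s for s in phi_spans if not any(overlaps(s, p) for p in protected)]
```

With this rule, "Hess" in "Dr. Hess" would still be masked, while "Hunt" and "Hess" in "Hunt and Hess grade 3" would be left intact.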

Thanks a lot for your efforts in maintaining and further developing the MIMIC database; it is a great resource!

Best, Tilman

tilmanbeck · Sep 18 '24 12:09