Missing files and errors in `external_validation` with MIMIC-III

Open HarrisonWilde opened this issue 1 year ago • 0 comments

Hi! We are trying to run your code in the `external_validation` folder and have come up against two issues so far. First, we do not have any of the data referenced in the leomed section of `mimic_paths.py`. Is this data accessible, or is it private? Without it we cannot run `run_mimic_prep.py`. Second, on the current version of MIMIC-III, we get this error in the age calculation:

Traceback (most recent call last):
  File "np_datetime.pyx", line 736, in pandas._libs.tslibs.np_datetime.add_overflowsafe
OverflowError: value too large

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/harryw/eva/circEWS/external_validation/run_mimic_prep.py", line 10, in <module>
    em.build_static_table(version=version)
  File "/home/harryw/eva/circEWS/external_validation/extract_data_from_mimic.py", line 725, in build_static_table
    table["Age"] = (pd.to_datetime(table["ADMITTIME"]) - pd.to_datetime(table["DOB"])) / np.timedelta64(24 * 365, "h")
                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  File "/home/harryw/.conda/envs/eva/lib/python3.12/site-packages/pandas/core/ops/common.py", line 76, in new_method
    return method(self, other)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/harryw/.conda/envs/eva/lib/python3.12/site-packages/pandas/core/arraylike.py", line 194, in __sub__
    return self._arith_method(other, operator.sub)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/harryw/.conda/envs/eva/lib/python3.12/site-packages/pandas/core/series.py", line 6126, in _arith_method
    return base.IndexOpsMixin._arith_method(self, other, op)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/harryw/.conda/envs/eva/lib/python3.12/site-packages/pandas/core/base.py", line 1382, in _arith_method
    result = ops.arithmetic_op(lvalues, rvalues, op)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/harryw/.conda/envs/eva/lib/python3.12/site-packages/pandas/core/ops/array_ops.py", line 273, in arithmetic_op
    res_values = op(left, right)
                 ^^^^^^^^^^^^^^^
  File "/home/harryw/.conda/envs/eva/lib/python3.12/site-packages/pandas/core/ops/common.py", line 76, in new_method
    return method(self, other)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/harryw/.conda/envs/eva/lib/python3.12/site-packages/pandas/core/arrays/datetimelike.py", line 1457, in __sub__
    result = self._sub_datetime_arraylike(other)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/harryw/.conda/envs/eva/lib/python3.12/site-packages/pandas/core/arrays/datetimelike.py", line 1154, in _sub_datetime_arraylike
    return self._sub_datetimelike(other)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/harryw/.conda/envs/eva/lib/python3.12/site-packages/pandas/core/arrays/datetimelike.py", line 1169, in _sub_datetimelike
    res_values = add_overflowsafe(self.asi8, np.asarray(-other_i8, dtype="i8"))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "np_datetime.pyx", line 709, in pandas._libs.tslibs.np_datetime.add_overflowsafe
  File "np_datetime.pyx", line 743, in pandas._libs.tslibs.np_datetime.add_overflowsafe
OverflowError: Overflow in int64 addition

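This can be worked around by adding the following line, which drops the rows whose dates of birth MIMIC-III shifts back into the 1800s (patients older than 89); for those rows the roughly 300-year gap between `DOB` and `ADMITTIME` does not fit in a nanosecond-resolution int64 timedelta: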

table = table[pd.to_datetime(table["DOB"]) > pd.Timestamp("1950-01-01")]

However, this already means our cohort differs from yours, which makes reproducing the results difficult.
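
A possible alternative that keeps those patients is sketched below. It is not the repository's method, only an assumption about what might work: the subtraction is done at second resolution in NumPy, where a 300-year difference still fits in an int64. The function name `age_in_years`, the `shifted_threshold` cutoff, and the `placeholder_age` value are all hypothetical names and values chosen for illustration.

```python
import numpy as np
import pandas as pd


def age_in_years(admit, dob, shifted_threshold=200.0, placeholder_age=None):
    """Age at admission in years (365-day years, as in the original line).

    Hypothetical helper: subtracts at second resolution so that the
    ~300-year gaps of shifted DOBs do not overflow int64.
    """
    # Parse with pandas, then cast the underlying arrays to second resolution.
    admit_s = pd.to_datetime(admit).to_numpy().astype("datetime64[s]")
    dob_s = pd.to_datetime(dob).to_numpy().astype("datetime64[s]")

    # timedelta64[s] / timedelta64[s] gives a float64 array of years.
    ages = (admit_s - dob_s) / np.timedelta64(365 * 24 * 3600, "s")

    # Optionally replace implausible ages (shifted DOBs) with a placeholder.
    # Both the threshold and the placeholder are assumptions, not MIMIC rules.
    if placeholder_age is not None:
        ages = np.where(ages > shifted_threshold, placeholder_age, ages)
    return ages


# Toy rows: one ordinary patient and one with a DOB shifted by ~300 years.
example = pd.DataFrame(
    {
        "ADMITTIME": ["2150-01-01 00:00:00", "2180-06-01 00:00:00"],
        "DOB": ["2100-01-01 00:00:00", "1880-06-01 00:00:00"],
    }
)
example["Age"] = age_in_years(
    example["ADMITTIME"], example["DOB"], placeholder_age=91.4
)
```

Whether a placeholder age or some other treatment matches the original cohort definition is something only the authors can confirm; the sketch only shows that the overflow itself can be avoided without filtering rows.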

Thanks!

HarrisonWilde, May 22 '24 17:05