circEWS
Missing files and errors in `external_validation` with MIMIC-III
Hi! We are trying to run your code in the `external_validation` folder and have so far come up against two issues. First, we do not have any of the data referenced in the leomed section of `mimic_paths.py`. Is this data accessible, or is it private? Without it we cannot run `run_mimic_prep.py`. Second, on the current version of MIMIC-III, we get this error in the age calculation:
Traceback (most recent call last):
File "np_datetime.pyx", line 736, in pandas._libs.tslibs.np_datetime.add_overflowsafe
OverflowError: value too large
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/harryw/eva/circEWS/external_validation/run_mimic_prep.py", line 10, in <module>
em.build_static_table(version=version)
File "/home/harryw/eva/circEWS/external_validation/extract_data_from_mimic.py", line 725, in build_static_table
table["Age"] = (pd.to_datetime(table["ADMITTIME"]) - pd.to_datetime(table["DOB"])) / np.timedelta64(24 * 365, "h")
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
File "/home/harryw/.conda/envs/eva/lib/python3.12/site-packages/pandas/core/ops/common.py", line 76, in new_method
return method(self, other)
^^^^^^^^^^^^^^^^^^^
File "/home/harryw/.conda/envs/eva/lib/python3.12/site-packages/pandas/core/arraylike.py", line 194, in __sub__
return self._arith_method(other, operator.sub)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/harryw/.conda/envs/eva/lib/python3.12/site-packages/pandas/core/series.py", line 6126, in _arith_method
return base.IndexOpsMixin._arith_method(self, other, op)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/harryw/.conda/envs/eva/lib/python3.12/site-packages/pandas/core/base.py", line 1382, in _arith_method
result = ops.arithmetic_op(lvalues, rvalues, op)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/harryw/.conda/envs/eva/lib/python3.12/site-packages/pandas/core/ops/array_ops.py", line 273, in arithmetic_op
res_values = op(left, right)
^^^^^^^^^^^^^^^
File "/home/harryw/.conda/envs/eva/lib/python3.12/site-packages/pandas/core/ops/common.py", line 76, in new_method
return method(self, other)
^^^^^^^^^^^^^^^^^^^
File "/home/harryw/.conda/envs/eva/lib/python3.12/site-packages/pandas/core/arrays/datetimelike.py", line 1457, in __sub__
result = self._sub_datetime_arraylike(other)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/harryw/.conda/envs/eva/lib/python3.12/site-packages/pandas/core/arrays/datetimelike.py", line 1154, in _sub_datetime_arraylike
return self._sub_datetimelike(other)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/harryw/.conda/envs/eva/lib/python3.12/site-packages/pandas/core/arrays/datetimelike.py", line 1169, in _sub_datetimelike
res_values = add_overflowsafe(self.asi8, np.asarray(-other_i8, dtype="i8"))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "np_datetime.pyx", line 709, in pandas._libs.tslibs.np_datetime.add_overflowsafe
File "np_datetime.pyx", line 743, in pandas._libs.tslibs.np_datetime.add_overflowsafe
OverflowError: Overflow in int64 addition
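This can be worked around by adding the following line, which drops the patients whose dates of birth are shifted back into the 1800s in MIMIC-III (the de-identification applied to patients older than 89). For those rows the admission-to-birth gap is about 300 years, which exceeds the roughly 292-year range of a nanosecond-resolution timedelta: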
table = table[pd.to_datetime(table["DOB"]) > pd.Timestamp("1950-01-01")]
But this already means we end up with a different cohort, which makes reproducing the results difficult.
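A possible alternative that keeps those patients in the table is to do the subtraction at second resolution instead of nanosecond resolution, so a gap of around 300 years no longer overflows. This is only a minimal sketch: `compute_age_years` is a hypothetical helper name, not something in the repository, and it assumes `ADMITTIME` and `DOB` are parseable by `pd.to_datetime` as in the original line. Whether the original cohort kept, dropped, or capped these ages is something we do not know.

```python
import numpy as np
import pandas as pd


def compute_age_years(table: pd.DataFrame) -> pd.Series:
    """Age at admission in 365-day years (hypothetical helper, not from the repo)."""
    # Convert to second resolution so a ~300-year difference fits in int64.
    admit = pd.to_datetime(table["ADMITTIME"]).to_numpy().astype("datetime64[s]")
    dob = pd.to_datetime(table["DOB"]).to_numpy().astype("datetime64[s]")
    # Same year length as the original line: 24 * 365 hours, expressed in seconds.
    seconds_per_year = np.timedelta64(365 * 24 * 3600, "s")
    return pd.Series((admit - dob) / seconds_per_year, index=table.index, name="Age")


# Made-up rows for illustration only; the first mimics a shifted date of birth.
example = pd.DataFrame(
    {
        "ADMITTIME": ["2150-01-01 00:00:00", "2150-06-01 00:00:00"],
        "DOB": ["1850-01-01 00:00:00", "2100-06-01 00:00:00"],
    }
)
example["Age"] = compute_age_years(example)
```

With this, the shifted rows come out with an age of roughly 300 rather than raising, so they can then be kept, capped, or filtered explicitly depending on what the original pipeline did.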
Thanks!