eds-scikit
Errors when running `introduction.ipynb`
When running the code from the "A gentle demo" section of the documentation with version 0.1.6, some commands return errors (probably originating from small syntax changes).
Description
- In section "Extracting diabetes status", the following command does not output the same result as in the documentation:
diabetes.concept.value_counts()
The discrepancy was solved in my case by replacing the `concept` column with the `value` column.
- In section "Extracting covid status", the code cell below raises a `KeyError: 'code_list'` arising from line 81 in the `event_from_code` function:
codes = dict(
    COVID=dict(
        code_list=r"U071[0145]",
        code_type="regex",
    )
)
covid = conditions_from_icd10(
    condition_occurrence=data.condition_occurrence,
    visit_occurrence=data.visit_occurrence,
    codes=codes,
    date_min=DATE_MIN,
    date_max=DATE_MAX,
)
Changing the dictionary as follows solved the issue in my case:
codes = dict(
    COVID=dict(
        regex=r"U071[0145]",
    )
)
- In section "Adding patient age", the following error is raised when trying to compute patient age:
TypeError: One of the provided Serie isn't a datetime Serie
A solution in my case was to convert `birth_datetime` to datetime format using the following command:
visit_detail_covid["birth_datetime"] = visit_detail_covid["birth_datetime"].apply(lambda x: pd.to_datetime(x))
I guess the issue might be coming from the i2b2 connector
How to reproduce the bug
Code to load an i2b2 database (common to the 3 bugs):
import eds_scikit
import datetime
from eds_scikit.io import HiveData
database_name = "cse_**"
data = HiveData(
    database_name=database_name,
    database_type="I2B2"
)
DATE_MIN = datetime.datetime(2018, 1, 1)
DATE_MAX = datetime.datetime(2019, 6, 1)
Minimal code for bug 1:
from eds_scikit.event.diabetes import diabetes_from_icd10
diabetes = diabetes_from_icd10(
    condition_occurrence=data.condition_occurrence,
    visit_occurrence=data.visit_occurrence,
    date_min=DATE_MIN,
    date_max=DATE_MAX,
)
diabetes.concept.value_counts()
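The workaround reported for bug 1 (reading the `value` column instead of `concept`) can be sketched in a version-tolerant way. This is only an illustration on a toy pandas DataFrame: `status_counts` is a hypothetical helper, not part of eds-scikit, the status labels are made up, and the frames returned through `HiveData` may be Koalas rather than pandas objects.

```python
import pandas as pd


def status_counts(events: pd.DataFrame) -> pd.Series:
    """Count statuses using whichever column is present (hypothetical helper)."""
    # Prefer "value" (the column that worked in the report for 0.1.6),
    # then fall back to "concept" (the column used in the documentation).
    for column in ("value", "concept"):
        if column in events.columns:
            return events[column].value_counts()
    raise KeyError("neither 'value' nor 'concept' column found")


# Toy frame standing in for the output of diabetes_from_icd10 (made-up labels).
toy = pd.DataFrame({"value": ["status_a", "status_b", "status_a"]})
counts = status_counts(toy)
```

If both columns are present, the helper uses `value`, which matches the behaviour described in this report but has not been checked against other versions.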
Minimal code for bug 2:
from eds_scikit.event import conditions_from_icd10
codes = dict(
    COVID=dict(
        code_list=r"U071[0145]",
        code_type="regex",
    )
)
covid = conditions_from_icd10(
    condition_occurrence=data.condition_occurrence,
    visit_occurrence=data.visit_occurrence,
    codes=codes,
    date_min=DATE_MIN,
    date_max=DATE_MAX,
)
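The workaround reported for bug 2 replaces the `code_list`/`code_type` pair with a single key named after the code type. A minimal sketch of that conversion follows. `to_new_style` is a hypothetical helper, not an eds-scikit function, and it assumes that the code type is the expected key; this report only confirms that for `regex`.

```python
def to_new_style(codes: dict) -> dict:
    """Turn {"NAME": {"code_list": x, "code_type": t}} into {"NAME": {t: x}}."""
    converted = {}
    for name, spec in codes.items():
        if "code_list" in spec and "code_type" in spec:
            # Old documentation style: move the pattern under its code type.
            converted[name] = {spec["code_type"]: spec["code_list"]}
        else:
            # Assumed to already be in the new style: copy unchanged.
            converted[name] = dict(spec)
    return converted


old_codes = dict(COVID=dict(code_list=r"U071[0145]", code_type="regex"))
new_codes = to_new_style(old_codes)
```

The resulting `new_codes` can then be passed as `codes=new_codes` to `conditions_from_icd10`, as in the workaround above.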
Minimal code for bug 3:
from eds_scikit.event import conditions_from_icd10
from eds_scikit.utils import datetime_helpers
codes = dict(
    COVID=dict(
        regex=r"U071[0145]",
    )
)
covid = conditions_from_icd10(
    condition_occurrence=data.condition_occurrence,
    visit_occurrence=data.visit_occurrence,
    codes=codes,
    date_min=DATE_MIN,
    date_max=DATE_MAX,
)
visit_detail_covid = data.visit_detail.merge(
    covid[["visit_occurrence_id"]],
    on="visit_occurrence_id",
    how="inner",
)
visit_detail_covid = visit_detail_covid.merge(
    data.person[["person_id", "birth_datetime"]],
    on="person_id",
    how="inner",
)
visit_detail_covid["age"] = (
    datetime_helpers.substract_datetime(
        visit_detail_covid["visit_detail_start_datetime"],
        visit_detail_covid["birth_datetime"],
        out="hours",
    )
    / (24 * 365.25)
)
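The workaround reported for bug 3 (converting `birth_datetime` to a datetime type before computing the age) can be illustrated on a toy pandas DataFrame. This sketch uses plain pandas subtraction in place of `datetime_helpers.substract_datetime`, whose behaviour is not reproduced here, and the dates are made up. Frames loaded through `HiveData` may be Koalas objects, for which the conversion call could differ.

```python
import pandas as pd

# Toy data: birth_datetime arrives as strings, as suspected with the i2b2 connector.
visits = pd.DataFrame(
    {
        "visit_detail_start_datetime": pd.to_datetime(["2019-01-01", "2018-07-02"]),
        "birth_datetime": ["1989-01-01", "1958-07-02"],
    }
)

# Convert the column to a datetime dtype and assign the result back.
visits["birth_datetime"] = pd.to_datetime(visits["birth_datetime"])

# Age in years: elapsed hours divided by the number of hours in an average year.
elapsed = visits["visit_detail_start_datetime"] - visits["birth_datetime"]
visits["age"] = elapsed.dt.total_seconds() / 3600 / (24 * 365.25)
```

Assigning the converted column back to the frame is what makes the later subtraction see a datetime column.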