clinical-outcome-prediction
Code for the EACL 2021 Paper: Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration
Clinical Outcome Prediction from Admission Notes
This repository contains source code for the task creation and experiments from our paper Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration, EACL 2021.
Use the CORe Model
To apply the CORe model, which is pre-trained on clinical outcomes, to downstream tasks, load it from the Hugging Face Model Hub:
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("bvanaken/CORe-clinical-outcome-biobert-v1")
model = AutoModel.from_pretrained("bvanaken/CORe-clinical-outcome-biobert-v1")
Create Admission Notes for Outcome Prediction from MIMIC-III
Install Requirements:
pip install -r tasks/requirements.txt
Create the train/val/test splits, e.g. for Mortality Prediction:
# All three arguments are required.
python tasks/mp/mp.py \
--mimic_dir {MIMIC_DIR} \
--save_dir {DIR_TO_SAVE_DATA} \
--admission_only True
mimic_dir: Directory that contains unpacked NOTEEVENTS.csv, ADMISSIONS.csv, DIAGNOSES_ICD.csv and PROCEDURES_ICD.csv
save_dir: Any directory to save the data
admission_only: True=Create simulated Admission Notes, False=Keep complete Discharge Summaries
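Before running a task script, it can help to confirm that mimic_dir actually contains the four CSV files listed above. The following is a minimal sketch of such a check; the helper name missing_mimic_files is hypothetical and not part of this repository.

```python
from pathlib import Path

# File names taken from the mimic_dir description above.
REQUIRED_FILES = (
    "NOTEEVENTS.csv",
    "ADMISSIONS.csv",
    "DIAGNOSES_ICD.csv",
    "PROCEDURES_ICD.csv",
)


def missing_mimic_files(mimic_dir):
    """Return the required MIMIC-III files that are not present in mimic_dir.

    Hypothetical helper for illustration; the task scripts do not provide it.
    """
    root = Path(mimic_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    import sys

    # Check the directory given on the command line, or the current one.
    missing = missing_mimic_files(sys.argv[1] if len(sys.argv) > 1 else ".")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required MIMIC-III files found.")
```

An empty result means all four unpacked files are in place and the directory can be passed as --mimic_dir.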
Run the corresponding script with the same arguments for the other outcome tasks:
Length-of-Stay (los/los.py),
Diagnoses (dia/dia.py),
Diagnoses + ICD+ (dia/dia_plus.py),
Procedures (pro/pro.py) and
Procedures + ICD+ (pro/pro_plus.py)
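To create the data for all tasks in one go, the commands can be assembled in a small loop. This is only a sketch: the helper build_task_command is hypothetical, the tasks/ prefix for the scripts other than mp.py is an assumption based on the location of tasks/mp/mp.py, and the directories are placeholders.

```python
import shlex

# Script names as listed above. The "tasks/" prefix for everything except
# mp.py is an assumption derived from the path tasks/mp/mp.py.
TASK_SCRIPTS = {
    "Mortality Prediction": "tasks/mp/mp.py",
    "Length-of-Stay": "tasks/los/los.py",
    "Diagnoses": "tasks/dia/dia.py",
    "Diagnoses + ICD+": "tasks/dia/dia_plus.py",
    "Procedures": "tasks/pro/pro.py",
    "Procedures + ICD+": "tasks/pro/pro_plus.py",
}


def build_task_command(script, mimic_dir, save_dir, admission_only=True):
    """Build the argument list for one task script (hypothetical helper)."""
    return [
        "python", script,
        "--mimic_dir", str(mimic_dir),
        "--save_dir", str(save_dir),
        "--admission_only", str(admission_only),
    ]


if __name__ == "__main__":
    # Placeholder directories; replace them with real paths.
    for name, script in TASK_SCRIPTS.items():
        cmd = build_task_command(script, "/path/to/mimic", "/path/to/output")
        print(f"# {name}")
        print(shlex.join(cmd))
```

The sketch only prints the commands. To execute them, each argument list could be passed to subprocess.run(cmd, check=True) instead.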
Train Outcome Prediction Tasks
1 - Build using Docker: Dockerfile
2 - Create Config File. See Example for Mortality Prediction: MP Example Config
3 - Run Training with Arguments
# All three arguments are required.
python doc_classification.py \
--task_config {PATH_TO_TASK_CONFIG.yaml} \
--model_name_or_path {PATH_TO_MODEL_OR_TRANSFORMERS_MODEL_HUB_NAME} \
--cache_dir {CACHE_DIR}
See doc_classification.py for optional parameters.
4 - Run Training with Hyperparameter Optimization
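# Takes the same parameters as doc_classification.py, plus --hpo_samples and --hpo_gpus (both required).
python hpo_doc_classification.py \
--task_config {PATH_TO_TASK_CONFIG.yaml} \
--model_name_or_path {PATH_TO_MODEL_OR_TRANSFORMERS_MODEL_HUB_NAME} \
--cache_dir {CACHE_DIR} \
--hpo_samples {NO_OF_SAMPLES} \
--hpo_gpus {NO_OF_GPUS}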
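Since both training scripts share the same three required arguments, one way to keep invocations consistent is to build them from a single function. The sketch below assumes nothing beyond the flags shown above; the function build_training_command and the config path mp_config.yaml are hypothetical.

```python
import shlex


def build_training_command(task_config, model_name_or_path, cache_dir,
                           hpo_samples=None, hpo_gpus=None):
    """Build the argument list for plain training or for HPO.

    Hypothetical helper: without HPO values it targets doc_classification.py,
    with both HPO values it targets hpo_doc_classification.py.
    """
    base_args = [
        "--task_config", str(task_config),
        "--model_name_or_path", str(model_name_or_path),
        "--cache_dir", str(cache_dir),
    ]
    if hpo_samples is None and hpo_gpus is None:
        return ["python", "doc_classification.py"] + base_args
    if hpo_samples is None or hpo_gpus is None:
        # Both HPO arguments are marked as required for the HPO script.
        raise ValueError("hpo_samples and hpo_gpus must be given together")
    return (
        ["python", "hpo_doc_classification.py"]
        + base_args
        + ["--hpo_samples", str(hpo_samples), "--hpo_gpus", str(hpo_gpus)]
    )


if __name__ == "__main__":
    # The config and cache paths are placeholders.
    model = "bvanaken/CORe-clinical-outcome-biobert-v1"
    print(shlex.join(build_training_command("mp_config.yaml", model, "./cache")))
    print(shlex.join(build_training_command("mp_config.yaml", model, "./cache",
                                            hpo_samples=10, hpo_gpus=1)))
```

The returned list can be passed to subprocess.run(cmd, check=True) from inside the Docker container built in step 1.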
Cite
@inproceedings{vanAken2021,
author = {Betty van Aken and
Jens-Michalis Papaioannou and
Manuel Mayrdorfer and
Klemens Budde and
Felix A. Gers and
Alexander Löser},
title = {Clinical Outcome Prediction from Admission Notes using Self-Supervised
Knowledge Integration},
booktitle = {Proceedings of the 16th Conference of the European Chapter of the
Association for Computational Linguistics: Main Volume, {EACL} 2021,
Online, April 19 - 23, 2021},
pages = {881--893},
publisher = {Association for Computational Linguistics},
year = {2021},
url = {https://www.aclweb.org/anthology/2021.eacl-main.75/}
}