Fix/dates v2/symptom extraction addition
Contribution by Marielle Derocher and Zohreh Mahdavi for the Deep Learning for Healthcare class.
Our goal was to create a method for training a model to extract symptoms from MIMIC-III clinical notes.
We added two files:
pyhealth/tasks/symptom_extraction.py
The SymptomExtraction task trains a given token classification model, such as Bio_ClinicalBERT, to identify symptoms in clinical notes.
symptom_extraction_mimic.ipynb
This notebook gives an example of how to use the NOTEEVENTS data from MIMIC-III, identify and tokenize the symptoms using scispaCy and UMLS data, and use the SymptomExtraction task.
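To make the labeling step concrete, here is a minimal sketch of how symptom mentions can be turned into per-token BIO labels, the format a token classification model is usually trained on. It is not the code from the notebook: the small `SYMPTOM_LEXICON` is a hypothetical stand-in for the symptom terms obtained from scispaCy and UMLS, and `tokenize` and `bio_tag` are illustrative helper names.

```python
import re

# Hypothetical stand-in for the symptom vocabulary. The notebook is described as
# deriving symptoms from scispaCy and UMLS data, which is not reproduced here.
SYMPTOM_LEXICON = {("chest", "pain"), ("shortness", "of", "breath"), ("nausea",)}


def tokenize(text):
    """Lowercase the text and split it into word tokens (punctuation is dropped)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bio_tag(tokens, lexicon=SYMPTOM_LEXICON):
    """Assign B-SYMPTOM / I-SYMPTOM / O labels using greedy longest match."""
    labels = ["O"] * len(tokens)
    max_len = max(len(term) for term in lexicon)
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate span first so multi-word symptoms win.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(tokens[i:i + n]) in lexicon:
                matched = n
                break
        if matched:
            labels[i] = "B-SYMPTOM"
            for j in range(i + 1, i + matched):
                labels[j] = "I-SYMPTOM"
            i += matched
        else:
            i += 1
    return labels


note = "Patient reports chest pain and shortness of breath, denies nausea."
tokens = tokenize(note)
labels = bio_tag(tokens)
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

This sketch ignores negation ("denies nausea" is still tagged as a symptom) and works on whole words, so labels would still need to be aligned to subword tokens before fine-tuning a model such as Bio_ClinicalBERT.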
Hey Ammara and Marielle, super cool work!
I think there's something we at PyHealth have done a bad job of explaining, specifically in our documentation and tutorials (tbh, that's on me). If you're interested in expanding your PyHealth PR or modifying it so we can merge it, I'd love it if you guys could do the following:
- It seems the symptom extraction "task" is actually a model. If you guys could move it into the models directory and have it inherit from base_model, that would be really cool for us.
- Let me know if you need help with anything. I'd highly recommend looking at our RNN implementation for it.
- Thanks for being one of the groups that was really easy to grade this semester. I hope it hasn't been too bad of an experience at DL4H.
Closing this PR as it lacks proper labeling. Please add appropriate labels and reopen if needed.