cehr-bert
Update age normalization method in frequency-based model evaluator
Age is currently normalized over the entire dataset before it is split. The normalization statistics therefore include the held-out records, which can leak age information from the test and validation sets into training and is not aligned with best practice. The method needs to be updated so that evaluations are fair across the train/test/validation sets.
For the frequency-based baseline models, we need to stop normalizing age in the corresponding evaluators, where the data is processed for evaluation.
https://github.com/cumc-dbmi/cehr-bert/blob/8be39f18cfbfba0f3905110bdf6a2e0fa289ff08/evaluations/model_evaluators.py#L406
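For evaluators that still need a normalized age, one common way to avoid the leakage described above is to compute the normalization statistics on the training split only and reuse them for the other splits. The following is a minimal sketch of that idea, not the code in `model_evaluators.py`; the helper names `fit_age_scaler` and `normalize_ages` and the sample ages are hypothetical.

```python
from statistics import mean, pstdev


def fit_age_scaler(train_ages):
    """Compute mean and standard deviation from the training split only.

    Hypothetical helper, not part of cehr-bert.
    """
    mu = mean(train_ages)
    sigma = pstdev(train_ages)
    # Fall back to 1.0 if all training ages are identical, to avoid dividing by zero.
    if sigma == 0:
        sigma = 1.0
    return mu, sigma


def normalize_ages(ages, mu, sigma):
    """Apply statistics fitted on the training split to any split."""
    return [(age - mu) / sigma for age in ages]


# Made-up ages, for illustration only.
train_ages = [30, 40, 50]
test_ages = [40, 80]

# Statistics come from the training split; the test split never influences them.
mu, sigma = fit_age_scaler(train_ages)
train_scaled = normalize_ages(train_ages, mu, sigma)
test_scaled = normalize_ages(test_ages, mu, sigma)
```

Because `mu` and `sigma` are fitted before the test ages are seen, the age of 80 in the test split does not shift the scale applied to the training data. For the frequency-based baselines, the proposal in this issue is simpler still: skip the normalization step entirely.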