cehr-bert

Update age normalization method in frequency based model evaluator

Open xj2193 opened this issue 2 years ago • 0 comments

Age is currently normalized over the entire dataset before it is split, which can leak age information across the splits and is not aligned with best practice. The method needs to be updated so that evaluations are fair across the train/test/validation sets.
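As a rough illustration of the leakage-free pattern (not the code in `model_evaluators.py`), the sketch below splits first and then derives the normalization statistics from the training split only. The helper names `split_indices`, `fit_age_scaler`, and `transform_ages`, the z-score normalization, and the sample ages are all assumptions made up for this example.

```python
import random
import statistics


def split_indices(n, test_fraction=0.2, seed=42):
    """Shuffle row indices and split them into train and test index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = int(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def fit_age_scaler(train_ages):
    """Compute mean and standard deviation from the training ages only."""
    mean = statistics.fmean(train_ages)
    std = statistics.pstdev(train_ages)
    # Guard against a zero standard deviation (all ages identical).
    return mean, (std if std > 0 else 1.0)


def transform_ages(ages, mean, std):
    """Apply previously fitted statistics to any split."""
    return [(age - mean) / std for age in ages]


# Hypothetical ages, for illustration only.
ages = [23, 35, 41, 52, 58, 64, 67, 71, 79, 88]

# 1. Split first.
train_idx, test_idx = split_indices(len(ages))
train_ages = [ages[i] for i in train_idx]
test_ages = [ages[i] for i in test_idx]

# 2. Fit the statistics on the training split only.
mean, std = fit_age_scaler(train_ages)

# 3. Reuse the training statistics for every split.
train_scaled = transform_ages(train_ages, mean, std)
test_scaled = transform_ages(test_ages, mean, std)
```

Because the mean and standard deviation never see the held-out rows, the test ages are scaled exactly as unseen data would be at inference time.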

For the frequency-based baseline models, we need to stop normalizing age in the corresponding evaluators, where the data is processed for evaluation.

https://github.com/cumc-dbmi/cehr-bert/blob/8be39f18cfbfba0f3905110bdf6a2e0fa289ff08/evaluations/model_evaluators.py#L406

xj2193 · Feb 17 '23 20:02