GAMENet
About reproducing the performance
I want to reproduce LEAP and GAMENet. My scripts:
%run train_Leap.py
%run train_Leap.py --resume_path final.model --eval
%run train_GAMENet.py --model_name GAMENet-ddi --ddi
%run train_GAMENet.py --model_name GAMENet-ddi --ddi --resume_path final.model --eval
I chose final.model for the test predictions. Here are my results:
- LEAP (reproduced), test: DDI Rate: 0.0699, Jaccard: 0.4438, PRAUC: 0.6362, AVG_PRC: 0.6364, AVG_RECALL: 0.6158, AVG_F1: 0.6064, avg med 19.
- GAMENet, test (no DDI): DDI Rate: 0.0778, Jaccard: 0.5151, PRAUC: 0.7644, AVG_PRC: 0.6748, AVG_RECALL: 0.6943, AVG_F1: 0.6704
- GAMENet, test (with DDI): DDI Rate: 0.0775, Jaccard: 0.5153, PRAUC: 0.7678, AVG_PRC: 0.6765, AVG_RECALL: 0.6923, AVG_F1: 0.6705
These numbers are not the same as those in your paper, but they follow a similar trend. Is this normal, or is there a problem?
Looking forward to your response. Thank you!
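For anyone comparing numbers like the ones above, here is a minimal sketch of how per-visit Jaccard, precision, recall and F1 are commonly computed for predicted medication sets and then averaged over visits. It assumes the usual set-based definitions; the function names `set_metrics` and `average_metrics` are hypothetical, and the repository's own evaluation code may differ in details.

```python
def set_metrics(true_meds, pred_meds):
    """Return (jaccard, precision, recall, f1) for one visit.

    true_meds / pred_meds are iterables of medication codes.
    Assumption: standard set-based definitions, not the repo's exact code.
    """
    true_set, pred_set = set(true_meds), set(pred_meds)
    inter = len(true_set & pred_set)
    union = len(true_set | pred_set)
    jaccard = inter / union if union else 0.0
    precision = inter / len(pred_set) if pred_set else 0.0
    recall = inter / len(true_set) if true_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return jaccard, precision, recall, f1


def average_metrics(visits):
    """Average the four metrics over a non-empty list of (true, pred) pairs."""
    per_visit = [set_metrics(t, p) for t, p in visits]
    n = len(per_visit)
    return tuple(sum(m[i] for m in per_visit) / n for i in range(4))


# Toy example with made-up medication codes.
example = [({"A", "B", "C"}, {"B", "C", "D"}), ({"A"}, {"A"})]
avg_jaccard, avg_prc, avg_recall, avg_f1 = average_metrics(example)
```

Because every metric is averaged per visit, changing how many medications are attached to each visit in the training data can shift all of them at once.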
Thanks for pointing out the problem! It is caused by a mismatch between the training data in the GitHub repository and the training data used in our paper.
How to fix this problem
If you re-generate the training data using EDA.ipynb, you will get the same data as in our paper.
Why is the performance higher?
The data in the GitHub repository was probably generated with the following line in EDA.ipynb commented out:
#med_pd = filter_first24hour_med(med_pd)
With this filter disabled, the medications for a visit are collected beyond the first day, so performance may improve because more data is available for each patient. As we argue in our paper, however, the first 24 hours are often the most critical time for patients to receive correct treatment quickly. Which version of the data you use in your work is up to you.
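To illustrate the difference between the two data versions, here is a rough sketch of a first-24-hour filter. It is not the `filter_first24hour_med` function from EDA.ipynb. The function name `keep_first_24h`, the record fields (`visit_id`, `start`, `drug`) and the rule "within 24 hours of the visit's earliest medication start" are all assumptions made for this example.

```python
from datetime import datetime, timedelta


def keep_first_24h(records):
    """Keep medication records that start within 24 hours of the visit's
    earliest medication start.

    Hypothetical helper: field names and the 24-hour rule are assumptions,
    not the logic of filter_first24hour_med in EDA.ipynb.
    """
    # Find the earliest medication start time for each visit.
    first_start = {}
    for rec in records:
        visit = rec["visit_id"]
        if visit not in first_start or rec["start"] < first_start[visit]:
            first_start[visit] = rec["start"]

    window = timedelta(hours=24)
    return [r for r in records if r["start"] - first_start[r["visit_id"]] < window]


# Made-up records: one visit with medications on day 1 and day 3.
records = [
    {"visit_id": 1, "start": datetime(2020, 1, 1, 8), "drug": "A"},
    {"visit_id": 1, "start": datetime(2020, 1, 1, 20), "drug": "B"},
    {"visit_id": 1, "start": datetime(2020, 1, 3, 9), "drug": "C"},
]
filtered = keep_first_24h(records)
kept_drugs = [r["drug"] for r in filtered]
```

Without the filter, all three drugs would be attached to the visit. With it, only the first-day drugs remain, which makes the per-visit medication sets smaller.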
I tried running EDA.ipynb but got the error 'ndc_rxnorm_file' is not defined. On further investigation, I found that there is a mismatch in the file name. Please take note of this.