justcause
Implement R-Pol policy risk score and ROC AUC score
Some datasets cannot be evaluated with the scores currently in use (PEHE and ENoRMSE), because:
- No ground truth is available (e.g. the Jobs dataset from LaLonde)
- The outcome is binary and the classes are imbalanced (e.g. the Twins dataset)
Thus, we need more scores for a comprehensive evaluation, in particular the policy risk used, for example, by Shalit et al. In addition, the ROC curve, or the area under the ROC curve (AUC), should be used in binary cases like the Twins dataset.
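As a starting point for discussion, here is a minimal sketch of both scores in plain Python. The function names `policy_risk` and `roc_auc` and the `threshold` parameter are hypothetical and not part of the existing JustCause API. The policy risk follows the usual estimator of one minus the policy value, and assumes a randomized sample, a binary treatment indicator, an outcome where higher is better, and a policy that treats a unit when its predicted effect exceeds the threshold. The AUC uses the pairwise (Mann-Whitney) formulation.

```python
def policy_risk(ite_pred, y, t, threshold=0.0):
    """Estimate the policy risk (1 - policy value) on a randomized sample.

    Assumed policy: treat a unit when its predicted individual
    treatment effect exceeds `threshold` (hypothetical parameter).
    """
    treat = [ite > threshold for ite in ite_pred]
    # Only units whose actual assignment matches the policy contribute,
    # so no counterfactual outcomes are required.
    y_treated = [yi for yi, ti, pi in zip(y, t, treat) if pi and ti == 1]
    y_control = [yi for yi, ti, pi in zip(y, t, treat) if not pi and ti == 0]
    p_treat = sum(treat) / len(treat)
    # Assumption: an empty group contributes a mean outcome of 0.
    mean_treated = sum(y_treated) / len(y_treated) if y_treated else 0.0
    mean_control = sum(y_control) / len(y_control) if y_control else 0.0
    policy_value = p_treat * mean_treated + (1 - p_treat) * mean_control
    return 1.0 - policy_value


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random positive is scored above a
    random negative, counting ties as one half.
    """
    pos = [s for s, yi in zip(scores, y_true) if yi == 1]
    neg = [s for s, yi in zip(scores, y_true) if yi == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise AUC is quadratic in the sample size, so for a dataset the size of Twins it would probably be better to delegate to `sklearn.metrics.roc_auc_score(y_true, y_score)`.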