RecTools
RecTools copied to clipboard
Metrics. Roc-curve plot and auc added
trafficstars
Implemented roc-curve and auc, but have some questions:
- It seems that roc for k extends plot linearly and connect with (1, 1) like: [https://wiki.epfl.ch/edicpublic/documents/Candidacy%20exam/Evaluation.pdf] But usually it is implemented using scores for all items, but it seems that modelBase doesn't do this. If I tried to use k = len(catalog) (for all elements) some of them were discarded (about 30-50% for user) I've tried debugging, but I'm not sure if it's supposed to work like this.
- For AUC I've put scores for not recommended elements as 0 (same reason (1)). So it works like LAUC from article from (1)
- I optimized roc-plot by changing confusion matrix instead of recalculating using make_confusions and went from 15 sec to 1 sec on ratings.dat, but AUC works really slow: more than 30 sec for user. I've tried to use numpy vectorize and broadcasting, but it didn't work because of addressing pandas. I'll try to find a way to do this, but maybe you have suggestions
- I'm not sure how to test the plot, but tested tpr/fpr calculator. Also I added test for one user auc but didn't add test for all users because of speed.
- linter passed flake8, but gives codespelling mistake, because it doesn't recognize fpr. I'm not sure if I had to change it in smth like makefile. Maybe you faced similar problem
ROC-curve screenshot:

- Tests are failed here because it cannot find matplotlib. I'm not sure if I should add it to requirements
- linter found isort problem I fixed it. I'll commit it with next changes
Codecov Report
Merging #35 (32193c3) into main (eee3ba5) will not change coverage. The diff coverage is
100.00%.
@@ Coverage Diff @@
## main #35 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 44 45 +1
Lines 2209 2263 +54
=========================================
+ Hits 2209 2263 +54
| Impacted Files | Coverage Δ | |
|---|---|---|
| rectools/metrics/__init__.py | 100.00% <100.00%> (ø) |
|
| rectools/metrics/roc_auc.py | 100.00% <100.00%> (ø) |