RecTools

Metrics: ROC curve plot and AUC added

Open jegorus opened this issue 2 years ago • 2 comments

I implemented the ROC curve and AUC, but I have some questions:

  1. It seems that the ROC curve at k is extended linearly and connected to (1, 1), as in [https://wiki.epfl.ch/edicpublic/documents/Candidacy%20exam/Evaluation.pdf]. Usually ROC is computed from scores for all items, but it seems that ModelBase doesn't provide them. When I tried k = len(catalog) (i.e. all items), some of the items were discarded (about 30-50% per user). I tried debugging, but I'm not sure whether it is supposed to work like this.
  2. For AUC I set the scores of non-recommended items to 0 (for the reason given in (1)), so it works like LAUC from the article linked in (1).
  3. I optimized the ROC plot by updating the confusion matrix incrementally instead of recalculating it with make_confusions, which brought it from 15 sec down to 1 sec on ratings.dat. AUC is still really slow, though: more than 30 sec per user. I tried NumPy vectorization and broadcasting, but it didn't work because of the pandas indexing. I'll try to find a way to do this, but maybe you have suggestions.
  4. I'm not sure how to test the plot, but I tested the TPR/FPR calculation. I also added a test for single-user AUC, but not for all users because of speed.
  5. The linter passes flake8 but reports a codespell error because it doesn't recognize fpr. I'm not sure whether I should change something like the Makefile for this. Maybe you have faced a similar problem.
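
To make points 1-3 concrete, here is a minimal NumPy sketch of the idea for a single user. It is not the code from this PR: the function names `roc_points_at_k` and `limited_auc` are hypothetical, and it assumes that the ranked list has no duplicates and that every relevant item belongs to the catalog. The confusion counts are updated with cumulative sums rather than recomputed per cutoff, the curve is joined linearly to (1, 1), and the area is taken with the trapezoid rule. That linear tail is equivalent to giving all non-recommended items the same score of 0.

```python
import numpy as np


def roc_points_at_k(ranked_items, relevant_items, n_catalog):
    """ROC points for one user's top-k list, extended linearly to (1, 1).

    Hypothetical helper: assumes `ranked_items` is ordered by descending score,
    has no duplicates, and that all `relevant_items` are in the catalog.
    """
    n_pos = len(relevant_items)
    n_neg = n_catalog - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC is undefined without both positives and negatives")

    # hits[i] is True when the item at rank i is relevant
    hits = np.isin(np.asarray(ranked_items), list(relevant_items))

    # Running TP/FP counts: one pass instead of a new confusion matrix per cutoff
    tp = np.cumsum(hits)
    fp = np.cumsum(~hits)

    # Start at (0, 0) and close the curve at (1, 1)
    tpr = np.concatenate(([0.0], tp / n_pos, [1.0]))
    fpr = np.concatenate(([0.0], fp / n_neg, [1.0]))
    return fpr, tpr


def limited_auc(fpr, tpr):
    """Area under the piecewise-linear ROC curve (trapezoid rule)."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))


fpr, tpr = roc_points_at_k([1, 2, 3, 4], {1, 3, 9}, n_catalog=10)
auc = limited_auc(fpr, tpr)
```

For this toy input the area is 31/42 (about 0.738), which matches the pairwise AUC when items 1-4 get descending scores and all other items get 0. Because everything is array-based per user, the same computation could in principle be applied per group of a recommendations table, but whether that is fast enough on ratings.dat is untested here.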

ROC-curve screenshot:

jegorus, Mar 30 '23 18:03

  1. The tests fail here because matplotlib cannot be found. I'm not sure whether I should add it to the requirements.
  2. The linter found an isort problem, which I have fixed. I'll commit the fix with the next changes.
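
One possible way to handle the matplotlib question, if it should not become a hard requirement, is to treat it as an optional dependency and import it only inside the plotting function. The sketch below only illustrates that option; `has_matplotlib` and `plot_roc` are hypothetical names, not functions from this PR or from RecTools.

```python
import importlib.util


def has_matplotlib() -> bool:
    """Return True if matplotlib can be imported in the current environment."""
    return importlib.util.find_spec("matplotlib") is not None


def plot_roc(fpr, tpr):
    """Plot a ROC curve; fails with a clear message when matplotlib is missing."""
    if not has_matplotlib():
        raise ImportError("matplotlib is required for plotting the ROC curve")

    # Imported lazily so the metrics themselves work without matplotlib
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(fpr, tpr, label="ROC")
    ax.plot([0, 1], [0, 1], linestyle="--", label="random")
    ax.set_xlabel("FPR")
    ax.set_ylabel("TPR")
    ax.legend()
    return fig
```

With this layout the TPR/FPR and AUC tests do not depend on matplotlib, and a plotting test can be skipped when `has_matplotlib()` returns False.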

jegorus, Mar 30 '23 18:03

Codecov Report

Merging #35 (32193c3) into main (eee3ba5) will not change coverage. The diff coverage is 100.00%.

@@            Coverage Diff            @@
##              main       #35   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           44        45    +1     
  Lines         2209      2263   +54     
=========================================
+ Hits          2209      2263   +54     

Impacted Files                  Coverage Δ
rectools/metrics/__init__.py    100.00% <100.00%> (ø)
rectools/metrics/roc_auc.py     100.00% <100.00%> (ø)

codecov[bot], Apr 13 '23 18:04