bdl-benchmarks
Plots showing results in the Colab notebook display metrics on different scales.
I could be wrong, but this plot, for example, is clearly suspect: the naive method shows an AUC of around 0 rather than around 50. I believe the issue is that the results dictionary reports metrics on a [0, 1] scale, while the baseline models report them on a [0, 100] scale. I can't actually find where in the code the baseline results are scaled by 100, or else I would happily open a pull request. In general, I think it would be simpler to keep everything on a [0, 1] scale.
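As a stopgap until the scales are unified, one could normalize everything before plotting. This is only a sketch of a hypothetical helper (not part of the bdl-benchmarks API), and it assumes any metric value above 1.0 is a percentage:

```python
def normalize_metrics(results):
    """Rescale metrics reported on a [0, 100] scale down to [0, 1].

    Hypothetical workaround: assumes `results` maps metric names to
    floats, and treats any value greater than 1.0 as a percentage.
    """
    return {name: (value / 100.0 if value > 1.0 else value)
            for name, value in results.items()}


# e.g. a baseline AUC of 50 (percent) and a results-dict accuracy of 0.9
print(normalize_metrics({"auc": 50.0, "accuracy": 0.9}))
# → {'auc': 0.5, 'accuracy': 0.9}
```

The `> 1.0` heuristic would misfire on genuinely unbounded metrics, so it only makes sense for bounded scores like AUC and accuracy.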