responsible-ai-toolbox
Sort individual feature importance by residual for regression
This PR orders the samples in individual feature importance by abs(true_y - predicted_y) for regression. Samples with the most accurate predictions are shown first, followed by those where the model's prediction is further off.
Description
Before: sort by index

After: sort by abs(true_y - predicted_y)
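The new ordering can be illustrated with a minimal Python sketch. This is not the code changed in this PR; the function name `sort_indices_by_residual` and the sample values are hypothetical, and ascending order is assumed because the description says the better predictions come first.

```python
def sort_indices_by_residual(true_y, predicted_y):
    """Return sample indices ordered by ascending abs(true_y - predicted_y)."""
    if len(true_y) != len(predicted_y):
        raise ValueError("true_y and predicted_y must have the same length")
    # Absolute residual per sample.
    residuals = [abs(t - p) for t, p in zip(true_y, predicted_y)]
    # sorted() is stable, so samples with equal residuals keep index order.
    return sorted(range(len(residuals)), key=residuals.__getitem__)


# Hypothetical data for illustration only.
true_y = [3.0, 10.0, 5.0, 8.0]
predicted_y = [2.5, 4.0, 5.0, 9.0]
print(sort_indices_by_residual(true_y, predicted_y))  # prints [2, 0, 3, 1]
```

Because the sort is stable, samples with identical residuals fall back to the previous index order, which keeps the "Before" behavior as a tie-breaker.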

Checklist
- [x] I have added screenshots above for all UI changes.
- [ ] I have added e2e tests for all UI changes.
- [ ] Documentation was updated if it was needed.
Codecov Report
Merging #1487 (9342c46) into main (9342c46) will not change coverage. The diff coverage is n/a.
:exclamation: Current head 9342c46 differs from pull request most recent head 45455bf. Consider uploading reports for the commit 45455bf to get more accurate results
Coverage Diff

| | main | #1487 |
|---|---|---|
| Coverage | 89.30% | 89.30% |
| Files | 38 | 38 |
| Lines | 1617 | 1617 |
| Hits | 1444 | 1444 |
| Misses | 173 | 173 |

| Flag | Coverage Δ | |
|---|---|---|
| unittests | 89.30% <0.00%> (ø) |
Flags with carried forward coverage won't be shown.
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 9342c46...45455bf.
https://responsibleai.blob.core.windows.net/pullrequest/microsoft/responsible-ai-toolbox/tongy/sortByResidual/dashboard/index.html
@tongyu-microsoft I'm curious what prompted this change. Was this a feature request from somebody? As a user, I'd be a bit surprised about the ordering since we had it by index so far and that's immediately obvious. The residual ordering is not immediately obvious unless we point it out in writing. Perhaps that should be added? If you haven't yet you may want to consult with a designer 🙂
@romanlutz, I think sorting by some regression metric could help users understand why the model predicts some samples more accurately than others. For the classification scenario, we already differentiate between samples the model predicted correctly and those it did not. Agreed that we should document this in the dashboard, and that we should use some standard regression metric like r2_score to order these.
I think it would also be nice to see the values by which these are sorted as a separate column. If they are sorted by abs(true_y - predicted_y), it might be nice to have a column named something like "Abs difference" right after the index, so it's clear that this is what the data is sorted by. It doesn't necessarily have to be part of this PR though.
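A rough Python sketch of that suggestion, for illustration only: it is not the dashboard's code, and the function name `add_abs_difference_column`, the row layout, and the keys `index`, `true_y`, and `predicted_y` are all assumptions.

```python
def add_abs_difference_column(rows, true_key="true_y", pred_key="predicted_y"):
    """Return new rows sorted by "Abs difference", shown right after the index."""
    augmented = []
    for row in rows:
        abs_diff = abs(row[true_key] - row[pred_key])
        # Rebuild the row so "Abs difference" directly follows "index".
        new_row = {"index": row["index"], "Abs difference": abs_diff}
        new_row.update({k: v for k, v in row.items() if k != "index"})
        augmented.append(new_row)
    # Ascending: the most accurate predictions come first.
    return sorted(augmented, key=lambda r: r["Abs difference"])


# Hypothetical rows for illustration only.
rows = [
    {"index": 0, "true_y": 3.0, "predicted_y": 2.5},
    {"index": 1, "true_y": 10.0, "predicted_y": 4.0},
    {"index": 2, "true_y": 5.0, "predicted_y": 5.0},
]
for row in add_abs_difference_column(rows):
    print(row)
```

Showing the sort key as its own column makes the ordering self-explanatory without extra text in the UI.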
@imatiach-msft Thanks for the great comment! Yeah this PR is on hold and we are waiting for Owen's design on this, so that we can have different sorting options for users, not limited to Abs difference :)