skops
skops copied to clipboard
ENH permutation importance
I implemented permutation importance, will write tests if you like where it goes.
I have two problems:
- Are we okay with
matplotlib
being a dependency here? I'll look for something better. Same applies forpandas
. - I'm looking for a more standard way of getting feature names. For some base estimators or pipelines it varies, from
feature_names_in_
toget_feature_names_out()
.
The example script looks like below (I'll change layout of plot to fit inside the plot itself):
@BenjaminBossan I had a quick huddle last week with @adrinjalali where we discussed this and he told me to create a method to add feature importance graphs. I can remove this and add it to model card example instead.
he told me to create a method to add feature importance graphs. I can remove this and add it to model card example instead.
If both of you agree that this is a good thing to have, I'm also fine with it. I just wanted to make sure that this decision is made deliberately, after taking into account the tradeoffs. Especially, it will probably result in quite a few very similar methods being added in the future (like the example of adding CV results).
@BenjaminBossan I sort of agree and think the dependencies are a bit much if we want to move forward (I'll handle stuff with numpy instead). Would you like to wait for @adrinjalali to come back from his vacay to make a decision?
Would you like to wait for @adrinjalali to come back from his vacay to make a decision?
If you can manage to remove the pandas dependency and if you already decided with Adrin that this would be a good feature to have, I think we can move forward.
Fixes #104
oh I love how testing on windows machines 🙂🙂🙂🙂🙂🙂
- said by no one, ever
C:\\\\Users\\\\RUNNER~1\\\\AppData\\\\Local\\\\Temp\\\\skops-test180xjo68\\\\importance.png
C:\\\\Users\\\\RUNNER~1\\\\AppData\\\\Local\\\\Temp\\\\skops-test180xjo68/importance.png
I was hitting recursion with my solution so Adrin swapped it with pytest based one. (we couldn't find why mine wouldn't work)
@merveenoyan The tests are still failing because of this import thing.
Regarding the merge conflict, we would have to change the API a bit with the new card implementation. But I can take that over or we work on that together.
I checked my local environment and how GHActions is setup yet I can't figure out why it doesn't raise the error here meanwhile it passes on my local (and Adrin's local too) I don't get it.
@BenjaminBossan can you take a look? Can't ask for a review again
@BenjaminBossan codecov is failing :/
@merveenoyan again, codecov worked early in the day and now starts failing...
@BenjaminBossan codecov fails 😅🥲
@merveenoyan Re-running solved the codecov issue. Not sure why codecov hates you specifically :D
Could you please merge the current main branch, as have now added CI for Python 3.11? Then I'll review again.
@BenjaminBossan merged
@merveenoyan Some tests are failing. This is mostly fixed after merging with current main branch. Could you please do that?
@BenjaminBossan should work now (magically ✨) but I realized pytest doesn't collect tests of parser for some reason on mac (assuming it's same on windows) (that's why nothing failed on my local, I guess)
I checked github workflow and looked for needs
in case the workflow doesn't run it unless ubuntu tests pass (since I know ubuntu tests are usually fast so it's good to add such prerequisite, but here it's not the case). Then I decided to run the tests in isolation on my local both for whole file and specifically on the test that is failing, yet it refuses to collect on my local too! 🤯
(py39) ➜ card git:(feature_importance) python3 -m pytest -sv tests/test_parser.py
================================================= test session starts =================================================
platform darwin -- Python 3.9.16, pytest-7.2.1, pluggy-1.0.0 -- /opt/anaconda3/envs/py39/bin/python3
cachedir: .pytest_cache
rootdir: /Users/mervenoyan/Desktop/skops/skops, configfile: pyproject.toml
plugins: flaky-3.7.0, cov-4.0.0, anyio-3.6.2
collected 0 items / 2 skipped
do you know what's going on?
I realized pytest doesn't collect tests of parser for some reason on mac (assuming it's same on windows) (that's why nothing failed on my local, I guess)
Probably it's just because you haven't installed pandoc locally? This line makes it so that tests are skipped when pandoc is not found:
https://github.com/skops-dev/skops/blob/d9b7c369e7998e0819ab63baea81f15a004d8418/skops/card/tests/test_parser.py#L14-L18
The reason why it runs on Ubuntu CI is:
https://github.com/skops-dev/skops/blob/d9b7c369e7998e0819ab63baea81f15a004d8418/.github/workflows/build-test.yml#L65-L67
I haven't added pandoc to the other builds because it's less trivial to install it on Mac and Windows and, as you mentioned, Ubuntu is the fastest.
@BenjaminBossan I re-ran for codecov but no chance, it's failing 🥲🥲🥲🥲