skops ENH permutation importance

I implemented permutation importance, will write tests if you like where it goes.

I have two problems:

Are we okay with matplotlib being a dependency here? I'll look for something better. Same applies for pandas.
I'm looking for a more standard way of getting feature names. For some base estimators or pipelines it varies, from feature_names_in_ to get_feature_names_out().

The example script looks like below (I'll change layout of plot to fit inside the plot itself): Ekran Resmi 2022-09-16 19 37 37

Sep 16 '22 17:09 merveenoyan

@BenjaminBossan I had a quick huddle last week with @adrinjalali where we discussed this and he told me to create a method to add feature importance graphs. I can remove this and add it to model card example instead.

Sep 19 '22 15:09 merveenoyan

he told me to create a method to add feature importance graphs. I can remove this and add it to model card example instead.

If both of you agree that this is a good thing to have, I'm also fine with it. I just wanted to make sure that this decision is made deliberately, after taking into account the tradeoffs. Especially, it will probably result in quite a few very similar methods being added in the future (like the example of adding CV results).

Sep 19 '22 15:09 BenjaminBossan

@BenjaminBossan I sort of agree and think the dependencies are a bit much if we want to move forward (I'll handle stuff with numpy instead). Would you like to wait for @adrinjalali to come back from his vacay to make a decision?

Sep 19 '22 15:09 merveenoyan

Would you like to wait for @adrinjalali to come back from his vacay to make a decision?

If you can manage to remove the pandas dependency and if you already decided with Adrin that this would be a good feature to have, I think we can move forward.

Sep 19 '22 15:09 BenjaminBossan

Fixes #104

Nov 07 '22 16:11 merveenoyan

oh I love how testing on windows machines 🙂🙂🙂🙂🙂🙂

said by no one, ever

C:\\\\Users\\\\RUNNER~1\\\\AppData\\\\Local\\\\Temp\\\\skops-test180xjo68\\\\importance.png

C:\\\\Users\\\\RUNNER~1\\\\AppData\\\\Local\\\\Temp\\\\skops-test180xjo68/importance.png

Nov 08 '22 17:11 merveenoyan

I was hitting recursion with my solution so Adrin swapped it with pytest based one. (we couldn't find why mine wouldn't work)

Dec 15 '22 10:12 merveenoyan

@merveenoyan The tests are still failing because of this import thing.

Regarding the merge conflict, we would have to change the API a bit with the new card implementation. But I can take that over or we work on that together.

Dec 16 '22 16:12 BenjaminBossan

I checked my local environment and how GHActions is setup yet I can't figure out why it doesn't raise the error here meanwhile it passes on my local (and Adrin's local too) I don't get it.

Jan 06 '23 13:01 merveenoyan

@BenjaminBossan can you take a look? Can't ask for a review again

Jan 20 '23 14:01 merveenoyan

@BenjaminBossan codecov is failing :/

Jan 20 '23 15:01 merveenoyan

@merveenoyan again, codecov worked early in the day and now starts failing...

Jan 20 '23 16:01 BenjaminBossan

@BenjaminBossan codecov fails 😅🥲

Jan 23 '23 11:01 merveenoyan

@merveenoyan Re-running solved the codecov issue. Not sure why codecov hates you specifically :D

Could you please merge the current main branch, as have now added CI for Python 3.11? Then I'll review again.

Jan 23 '23 13:01 BenjaminBossan

@BenjaminBossan merged

Jan 24 '23 18:01 merveenoyan

@merveenoyan Some tests are failing. This is mostly fixed after merging with current main branch. Could you please do that?

Jan 26 '23 10:01 BenjaminBossan

@BenjaminBossan should work now (magically ✨) but I realized pytest doesn't collect tests of parser for some reason on mac (assuming it's same on windows) (that's why nothing failed on my local, I guess)

I checked github workflow and looked for needs in case the workflow doesn't run it unless ubuntu tests pass (since I know ubuntu tests are usually fast so it's good to add such prerequisite, but here it's not the case). Then I decided to run the tests in isolation on my local both for whole file and specifically on the test that is failing, yet it refuses to collect on my local too! 🤯

(py39) ➜  card git:(feature_importance) python3 -m pytest -sv tests/test_parser.py
================================================= test session starts =================================================
platform darwin -- Python 3.9.16, pytest-7.2.1, pluggy-1.0.0 -- /opt/anaconda3/envs/py39/bin/python3
cachedir: .pytest_cache
rootdir: /Users/mervenoyan/Desktop/skops/skops, configfile: pyproject.toml
plugins: flaky-3.7.0, cov-4.0.0, anyio-3.6.2
collected 0 items / 2 skipped

do you know what's going on?

Jan 29 '23 22:01 merveenoyan

I realized pytest doesn't collect tests of parser for some reason on mac (assuming it's same on windows) (that's why nothing failed on my local, I guess)

Probably it's just because you haven't installed pandoc locally? This line makes it so that tests are skipped when pandoc is not found:

https://github.com/skops-dev/skops/blob/d9b7c369e7998e0819ab63baea81f15a004d8418/skops/card/tests/test_parser.py#L14-L18

The reason why it runs on Ubuntu CI is:

https://github.com/skops-dev/skops/blob/d9b7c369e7998e0819ab63baea81f15a004d8418/.github/workflows/build-test.yml#L65-L67

I haven't added pandoc to the other builds because it's less trivial to install it on Mac and Windows and, as you mentioned, Ubuntu is the fastest.

Jan 30 '23 10:01 BenjaminBossan

@BenjaminBossan I re-ran for codecov but no chance, it's failing 🥲🥲🥲🥲

Jan 30 '23 12:01 merveenoyan

skops skops copied to clipboard

ENH permutation importance

skops
skops copied to clipboard