skops icon indicating copy to clipboard operation
skops copied to clipboard

ENH permutation importance

Open merveenoyan opened this issue 2 years ago • 4 comments

I implemented permutation importance, will write tests if you like where it goes.

I have two problems:

  • Are we okay with matplotlib being a dependency here? I'll look for something better. Same applies for pandas.
  • I'm looking for a more standard way of getting feature names. For some base estimators or pipelines it varies, from feature_names_in_ to get_feature_names_out().

The example script looks like below (I'll change layout of plot to fit inside the plot itself): Ekran Resmi 2022-09-16 19 37 37

merveenoyan avatar Sep 16 '22 17:09 merveenoyan

@BenjaminBossan I had a quick huddle last week with @adrinjalali where we discussed this and he told me to create a method to add feature importance graphs. I can remove this and add it to model card example instead.

merveenoyan avatar Sep 19 '22 15:09 merveenoyan

he told me to create a method to add feature importance graphs. I can remove this and add it to model card example instead.

If both of you agree that this is a good thing to have, I'm also fine with it. I just wanted to make sure that this decision is made deliberately, after taking into account the tradeoffs. Especially, it will probably result in quite a few very similar methods being added in the future (like the example of adding CV results).

BenjaminBossan avatar Sep 19 '22 15:09 BenjaminBossan

@BenjaminBossan I sort of agree and think the dependencies are a bit much if we want to move forward (I'll handle stuff with numpy instead). Would you like to wait for @adrinjalali to come back from his vacay to make a decision?

merveenoyan avatar Sep 19 '22 15:09 merveenoyan

Would you like to wait for @adrinjalali to come back from his vacay to make a decision?

If you can manage to remove the pandas dependency and if you already decided with Adrin that this would be a good feature to have, I think we can move forward.

BenjaminBossan avatar Sep 19 '22 15:09 BenjaminBossan

Fixes #104

merveenoyan avatar Nov 07 '22 16:11 merveenoyan

oh I love how testing on windows machines 🙂🙂🙂🙂🙂🙂

  • said by no one, ever
C:\\\\Users\\\\RUNNER~1\\\\AppData\\\\Local\\\\Temp\\\\skops-test180xjo68\\\\importance.png

C:\\\\Users\\\\RUNNER~1\\\\AppData\\\\Local\\\\Temp\\\\skops-test180xjo68/importance.png

merveenoyan avatar Nov 08 '22 17:11 merveenoyan

I was hitting recursion with my solution so Adrin swapped it with pytest based one. (we couldn't find why mine wouldn't work)

merveenoyan avatar Dec 15 '22 10:12 merveenoyan

@merveenoyan The tests are still failing because of this import thing.

Regarding the merge conflict, we would have to change the API a bit with the new card implementation. But I can take that over or we work on that together.

BenjaminBossan avatar Dec 16 '22 16:12 BenjaminBossan

I checked my local environment and how GHActions is setup yet I can't figure out why it doesn't raise the error here meanwhile it passes on my local (and Adrin's local too) I don't get it.

merveenoyan avatar Jan 06 '23 13:01 merveenoyan

@BenjaminBossan can you take a look? Can't ask for a review again

merveenoyan avatar Jan 20 '23 14:01 merveenoyan

@BenjaminBossan codecov is failing :/

merveenoyan avatar Jan 20 '23 15:01 merveenoyan

@merveenoyan again, codecov worked early in the day and now starts failing...

BenjaminBossan avatar Jan 20 '23 16:01 BenjaminBossan

@BenjaminBossan codecov fails 😅🥲

merveenoyan avatar Jan 23 '23 11:01 merveenoyan

@merveenoyan Re-running solved the codecov issue. Not sure why codecov hates you specifically :D

Could you please merge the current main branch, as have now added CI for Python 3.11? Then I'll review again.

BenjaminBossan avatar Jan 23 '23 13:01 BenjaminBossan

@BenjaminBossan merged

merveenoyan avatar Jan 24 '23 18:01 merveenoyan

@merveenoyan Some tests are failing. This is mostly fixed after merging with current main branch. Could you please do that?

BenjaminBossan avatar Jan 26 '23 10:01 BenjaminBossan

@BenjaminBossan should work now (magically ✨) but I realized pytest doesn't collect tests of parser for some reason on mac (assuming it's same on windows) (that's why nothing failed on my local, I guess)

I checked github workflow and looked for needs in case the workflow doesn't run it unless ubuntu tests pass (since I know ubuntu tests are usually fast so it's good to add such prerequisite, but here it's not the case). Then I decided to run the tests in isolation on my local both for whole file and specifically on the test that is failing, yet it refuses to collect on my local too! 🤯

(py39) ➜  card git:(feature_importance) python3 -m pytest -sv tests/test_parser.py
================================================= test session starts =================================================
platform darwin -- Python 3.9.16, pytest-7.2.1, pluggy-1.0.0 -- /opt/anaconda3/envs/py39/bin/python3
cachedir: .pytest_cache
rootdir: /Users/mervenoyan/Desktop/skops/skops, configfile: pyproject.toml
plugins: flaky-3.7.0, cov-4.0.0, anyio-3.6.2
collected 0 items / 2 skipped  

do you know what's going on?

merveenoyan avatar Jan 29 '23 22:01 merveenoyan

I realized pytest doesn't collect tests of parser for some reason on mac (assuming it's same on windows) (that's why nothing failed on my local, I guess)

Probably it's just because you haven't installed pandoc locally? This line makes it so that tests are skipped when pandoc is not found:

https://github.com/skops-dev/skops/blob/d9b7c369e7998e0819ab63baea81f15a004d8418/skops/card/tests/test_parser.py#L14-L18

The reason why it runs on Ubuntu CI is:

https://github.com/skops-dev/skops/blob/d9b7c369e7998e0819ab63baea81f15a004d8418/.github/workflows/build-test.yml#L65-L67

I haven't added pandoc to the other builds because it's less trivial to install it on Mac and Windows and, as you mentioned, Ubuntu is the fastest.

BenjaminBossan avatar Jan 30 '23 10:01 BenjaminBossan

@BenjaminBossan I re-ran for codecov but no chance, it's failing 🥲🥲🥲🥲

merveenoyan avatar Jan 30 '23 12:01 merveenoyan