skops
skops copied to clipboard
ENH - Gets SciKeras script working
WIP still need to fix one test
Reference Issues/PRs
Fixes #388
What does this implement/fix? Explain your changes.
This fixes a recursion error that was happening when dumping a scikeras model.
Any other comments?
It looks like all the pytest tests are failing with You have exceeded our daily quotas for action: createRepo. We invite you to retry later.
Should be addressed by #398
@lazarust Could you please solve the merge conflict? Regarding the uncovered new line: Would that be solved by adding tests for scikeras?
@BenjaminBossan I've fixed the merge conflict.
Yeah, I believe it would be but I was unsure if we wanted to have tests that included another library like that. Should I add one?
Yeah, I believe it would be but I was unsure if we wanted to have tests that included another library like that. Should I add one?
Yes, it would be good, since that was the initial reason for the change. We have external library tests, see here:
https://github.com/skops-dev/skops/blob/main/skops/io/tests/test_external.py
For scikeras, it won't be possible to add a comprehensive coverage of all possible models, so a simple example should be enough. If it's possible to add a unit test independently of scikeras that explicitly tests weakrefs, that would also be good, not sure how easy it is to do.
Regarding the failing tests, at first glance, it appears to be a change in the model repr in the latest sklearn version, so not related to the changes here.
@BenjaminBossan Sorry this took me so long to get back to. I've added a test to hit that line.
@adrinjalali The tests for sklearn nightly are failing because the model repr was changed (not sure why). Normally, we could fix that by having the test check the sklearn verison, but this is a doctest. Any idea how it can be fixed, short of skipping it whole?
Ugh, the list of trusted modules is giant now :D I guess it's related to the tensorflow change. Could you please explain why that was necessary? Also, we now get this error on CI:
E AttributeError: module 'scikeras' has no attribute 'wrappers'
@BenjaminBossan Yeah, sorry I realized I had marked the test method as a @pytest.fixture
🤦🏽, so the test wasn't ever running (which is why it wasn't erroring when switching from dump
to dumps
). I'm hoping to get the test fixed and cleanup that list of trusted modules today.
@BenjaminBossan After staring at this all day, I could use some help lol. For some reason, there's some infinite recursion happening when constructing the tree that I can't figure out why.
Initially, I thought it was due to the CachedNode
s in the tree, but the construct
method isn't getting hit for those. I've gone through the tree and didn't see anything outrageous so if you could take a look that'd be great!
I'll have to check when I'm back. Still off till end of November. But I remember CacheNode
caused some issues when I was working on it, partly due to small integers and other common objects having the same id
in python.
@adrinjalali It looks like you're right, there are some JSONNodes that have the same id
as some of the booleans that are getting cached.
To me, there are a couple of solutions:
- Find a better way to identify objects uniquely
- Maybe we should use uuid instead?
- Don't cache boolean values since these seem to have duplicate ids.
- Include the data type in the
__id__
(so it would be something like "id/dtype")- This isn't the best idea since we'd have to update everywhere that uses that id.
Find a better way to identify objects uniquely
- Maybe we should use uuid instead?
Could you explain further how that would work?
Don't cache boolean values since these seem to have duplicate ids.
Note that booleans are not the only values that could be affected, e.g. small integers will have the same issue.
@BenjaminBossan On that first point, I don't think that would work anymore. I had thought uuid
was just a more complicated id
function but looking into the documentation it seems to be doing something differently than I understood.
I also agree with your note, but I'm unsure of the best way forward. Do you have any thoughts?
Thinking about this, WDYT of saving TF objects using TF and saving them in our zipfile, not trusting any TF object by default, and when raising a warning about untrusted code, we also point users to this page (https://github.com/tensorflow/tensorflow/blob/master/SECURITY.md)
@adrinjalali That sounds good to me! After reading through the TensorFlow security documentation, showing the warning makes sense.
Would we show the warning on dumps
or only on loads
? I would think both probably but I could see an argument to just show it on loads
.
It would only be on loading time, and by default load fails anyway, we just need to make sure when it fails, we also give them the right information and the right link.
@adrinjalali Sweet, so things that need to be changed:
- When dumping a TensorFlow model save the TF object using TF and store it in the .zip file.
- When loading a Tensorflow model raise a warning linking to the TF docs a. Since the model fails to load anyway, we're just changing the error message at this point, if I'm understanding things correctly.
Do I have that right? Thanks for your help on this solution.
@adrinjalali I am just pinging you again to see if you have any thoughts before I get started on a solution.
@lazarust yes, that sounds perfect to me.
@adrinjalali @BenjaminBossan This is ready for y'all to look at again! Sorry it took a while, got a little busy around the holidays.
@adrinjalali This is ready for you to take a look at again. The failing tests are due to this bug in scikeras https://github.com/adriangb/scikeras/issues/313, so we'll probably need to wait for that to get fixed before moving this forward
I think you forgot a future import and old python is complaining.
@adrinjalali I got it working for older versions of Python other than Windows on 3.8 (I think the Ubuntu 3.9 failure was just a codecov issue). Could you help me figure out how to get it working? Downgrading tensorflow to 2.11 just breaks more versions.
One of the main reasons we support 3.8 was colab, which now supports 3.10. I think we can drop 3.8 alltogether.
Would you be kind enough to submit a PR to remove 3.8 support?
@adrinjalali I created this PR to remove 3.8 support #418, but I'm running into an issue where pip install scikit-learn-intelex
isn't finding any valid versions only on MacOS (this is happening both in the GH action and locally for me). I'm still looking into a fix
MacOS issue could be this: https://github.com/actions/setup-python/issues/850
Intelex is very brittle, and I don't think it's used much if at all. I'd be in favor of removing it. It was only to handle the inference side and HF inference is not that well maintained anyway.
@BenjaminBossan any objections to remove it?
Works for me.
@lazarust you've been very kind here. Would you remove sklearn-intelex
in a separate PR too?