skops icon indicating copy to clipboard operation
skops copied to clipboard

Add Method type in dispatch calls

Open E-Aho opened this issue 3 years ago • 21 comments

Fixes https://github.com/skops-dev/skops/issues/184

PR which adds MethodType and related functions to the skops.io dispatch functions.

I can expand on the test if more cases/checks are needed, but for now this shows the functionality works as expected.

E-Aho avatar Oct 17 '22 21:10 E-Aho

Well done figuring out where the error occurs.

Before we continue, I really want to understand what is going on here and if the suggested solution is the best one.

So first, it's true that we don't have any special case for objects whose attributes are methods bound to another object. E.g. this fails for now:

class Foo(BaseEstimator):
    def fit(self, X, y=None, **fit_params):
        self.bound_method_ = FunctionTransformer().transform
        return self
    def transform(self, X):
        return self.bound_method_(X)

def test_bound_method(tmp_path):
    estimator = Foo().fit(None)
    f_name = tmp_path / "file.skops"
    loaded = save_load_round(estimator, f_name)

gives:

    def object_get_instance(state, src):
        if state.get("is_json", False):
            return json.loads(state["content"])
    
        cls = gettype(state)
>       instance = cls.__new__(cls)
E       TypeError: method expected 2 arguments, got 0

I think we should add a test along these lines to our test suite.

How does this relate to the original issue? From what I can tell, the problem stems from these lines:

https://github.com/scipy/scipy/blob/1023d9207fdc1430a8ba196f1a1616ac3c264acf/scipy/stats/_distn_infrastructure.py#L3183-L3186

The vecentropy attribute is the one causing the issue. It is a numpy.vectorize of a bound method, but it's not actually bound to another instance, it's bound to itself! Still, it is recognized as a bound method, hence dispatches to the function that @E-Aho implemented.

What is strange to me is that the same issue does not apply to self._cdfvec, even though it's treated the same as vecentropy.

Another thing that surprised me is the following: I removed the whole vecentropy object from the state (I did this by going inside of dict_get_state and if the key is "vecentropy", I just skip it). I would have thought that this cannot work, but it does, the test passes without requiring method_get_state/instance, and the vecentropy attribute exists after loading.

I dug into this and I think it's because _attach_methods is called during __setstate__:

https://github.com/scipy/scipy/blob/1023d9207fdc1430a8ba196f1a1616ac3c264acf/scipy/stats/_distn_infrastructure.py#L636

If that's true, we should be able to drop those methods entirely from the state and let scipy restore them during __setstate__. WDYT?

Regardless of that, as mentioned at the top, we should have a way to save and load methods bound to other instances.

I also wondered whether we can add a special case for the numpy.vectorize class, similar to what we have with partial. Maybe that can solve the initial problem, but it wouldn't solve the general bound method issue.

BenjaminBossan avatar Oct 18 '22 11:10 BenjaminBossan

If that's true, we should be able to drop those methods entirely from the state and let scipy restore them during setstate. WDYT?

I fully agree, I ran into this as well while testing, but with "dummy functions" that returned nothing, a few other tests that checked on the dispatch functions started failing too.

I think in this case, it could definitely be solved by dropping these methods and adding a special condition to check for this during loading.

However, in a general case, I think there still needs to be a solution for bound methods more generally, which I think would be along the lines of this PR. E.g:

    class MethodHolder:
        def __init__(self, variant):
            if variant == "sqrt":
                self._method = np.sqrt
            elif variant == "log":
                self._method = np.log
            else:
                self._method = np.exp

        def func(self, x):
            return  self._method(x)

    a = MethodHolder("log")
    b = a.func
 

I feel like something along these lines isn't unlikely to exist somewhere in SKLearn/SciPy (given we've already found one that happens to get bounded during init), and it would probably be a good idea to provide support for serializing it correctly, which shouldn't be too hard to do.

E-Aho avatar Oct 18 '22 12:10 E-Aho

I think there still needs to be a solution for bound methods more generally, which I think would be along the lines of this PR

:+1: This is exactly what I meant when I wrote

we should have a way to save and load methods bound to other instances

We have to be precise with the wording here, we already persist objects with bound methods without issue, the problem are attributes that are methods bound to another object than the one we're persisting.

Once we have this general issue solved, if it doesn't solve the scipy issue, we can think about a more specific solution there.

BenjaminBossan avatar Oct 18 '22 12:10 BenjaminBossan

We have to be precise with the wording here, we already persist objects with bound methods without issue, the problem are attributes that are methods bound to another object than the one we're persisting.

Ah sorry, you're entirely correct. I neglected it in my comment, but I was meaning to show that if you used that b in a different object or function, then serialized that, that would be where this bug exists (to my understanding).

e.g if we had

    a = MethodHolder("log")
    b = a.func
    c = SomeOtherObject(func=b)

And tried to serialise c, this would happen I believe.

I think one other place this might occur for users is when trying to serialize something that can take a function as an input (e.g DistanceMetrics with a user defined distance).

If that function is a method bound to a different object, it wouldn't work right now AFAIK

E-Aho avatar Oct 18 '22 12:10 E-Aho

And tried to serialise c, this would happen I believe.

Yes, I think it would fail. Is there a qualitative difference to the other example I posted:

class Foo(BaseEstimator):
    def fit(self, X, y=None, **fit_params):
        self.bound_method_ = FunctionTransformer().transform
        return self
    def transform(self, X):
        return self.bound_method_(X)

BenjaminBossan avatar Oct 18 '22 13:10 BenjaminBossan

No, both would error, I was trying to think through a case where you would need to support bound methods that get manipulated during class creation/use and couldn't be static

(e.g depends on the "variant" param in my example)

E-Aho avatar Oct 18 '22 13:10 E-Aho

No, both would error, I was trying to think through a case where you would need to support bound methods that get manipulated during class creation/use and couldn't be static

Ah okay, got it, I thought maybe I was missing something.

BenjaminBossan avatar Oct 18 '22 13:10 BenjaminBossan

CI runner is failing with a 404 error on uploading to CodeCov 🤔

E-Aho avatar Oct 19 '22 14:10 E-Aho

CI runner is failing with a 404 error on uploading to CodeCov thinking

Most of the time, it's just some flaky behavior that can be fixed by rerunning the CI.

BenjaminBossan avatar Oct 19 '22 14:10 BenjaminBossan

Should be ready for review when you have a moment @BenjaminBossan :) let me know if you have any thoughts!

E-Aho avatar Oct 20 '22 15:10 E-Aho

Overall this looks good, thanks a lot for investigating and proposing a solution. I made a few comments, please have a look.

Thanks for the comments! And it was a fun problem to investigate :)

And just to be clear, the initial issue reported in #184 is not solved yet, right? When I test it, I get an error (infinite recursion). Since we changed the skope (haha) of this PR, that's fine, I just want to make sure that this is expected.

😨

I definitely changed my focus from the original issue to a more general case, but I'm surprised it hasn't addressed the original issue, given it was working with just empty function calls for method_get_state and method_get_instance...

I'll look into this and see if there's a way to address the original bug here too. I don't love the idea of merging in a bug fix if it introduces a new bug, especially one like an infinite recursion 😨

E-Aho avatar Oct 21 '22 12:10 E-Aho

So after some digging, I think the problem with the original issue being an infinite loop is as follows:

  1. We try to persist zipf as an object
  2. We get the states of the attributes of zipf which includes self.vecentropy, which is just numpy.vectorize(self._entropy)
  3. We try to persist the bound method, self._entropy, which depends on the state of the original zipf object, so try to persist statezipf
  4. goto 1)

This results in the loop.

I don't think this is a common pattern, so could add a special case for any rv_discrete objects, but I'm gonna try to see if I can think of a way to work around this for general cases of circular bound methods like this.

E-Aho avatar Oct 22 '22 16:10 E-Aho

Great that you figured out the cause. I think for this PR, it's not necessary to solve the whole circular reference problem, that might be best left for another day or else the PR will become too big.

As for potential solutions, we might be able to use the memoization mechanism that we already added here. There is an example of its usage for numpy arrays (which I now notice could be optimized...).

BenjaminBossan avatar Oct 22 '22 21:10 BenjaminBossan

Sounds good, I can raise an issue once this gets merged in to track the issue that's occuring with zipf right now 👍

E-Aho avatar Oct 23 '22 11:10 E-Aho

Huh, looks like given the changes in #200, neither FunctionType nor MethodType ever get returned by gettype.

E-Aho avatar Oct 25 '22 08:10 E-Aho

Huh, looks like given the changes in #200, neither FunctionType nor MethodType ever get returned by gettype.

Ah yes, this is not because gettype was changed in any way, this is because we no longer call gettype to determine kind of get_instance function to dispatch to.

BenjaminBossan avatar Oct 25 '22 09:10 BenjaminBossan

@BenjaminBossan if you have a minute could you rerun the CI?

E-Aho avatar Oct 25 '22 09:10 E-Aho

if you have a minute could you rerun the CI?

Sure, but feel free to ignore those "Error: Codecov: Failed to properly upload" messages. For some reason, those errors occur more frequently since the past week or so, but they're unrelated to anything actually being wrong.

BenjaminBossan avatar Oct 25 '22 09:10 BenjaminBossan

Oh for sure, I just wanted to check Codecov was actually happy now, the MethodType return not getting tested was making it upset

E-Aho avatar Oct 25 '22 09:10 E-Aho

I just wanted to check Codecov was actually happy now, the MethodType return not getting tested was making it upset

For sure, but you can take any of the passing CI runs to check that, no need to wait for each one to confirm.

BenjaminBossan avatar Oct 25 '22 09:10 BenjaminBossan

@BenjaminBossan, let me know if you have a moment to check the changes since your last review :)

E-Aho avatar Oct 25 '22 14:10 E-Aho

Ok, I've reverted this back to prior to the LoadState work.

I'll open an issue once this merges in and try to keep working on a LoadState that can load single instances across any object type :)

E-Aho avatar Oct 28 '22 19:10 E-Aho

Awesome, thanks for all the support guys!

I've reworded the xfail reason to be a bit clearer, I'm happy to merge this in if you're happy @BenjaminBossan

E-Aho avatar Oct 31 '22 11:10 E-Aho

@E-Aho Excellent, thanks, really great job.

Will you be working on the deduplication issue?

BenjaminBossan avatar Oct 31 '22 11:10 BenjaminBossan

@E-Aho Excellent, thanks, really great job.

Will you be working on the deduplication issue?

🤗 Thanks for all the support, it really was a huge help.

And yes, I'll keep working on the dupe issue, I'll raise an issue later today once I've got time, and start working on implementing a proper LoadState based on the comments in this PR.

E-Aho avatar Oct 31 '22 11:10 E-Aho