
FEAT - Sample weights

Open sujay-pandit opened this issue 1 year ago • 15 comments

Description of the feature

Scikit-learn's 'fit' methods have an option to pass a 'sample_weight' array to assign importance to samples and modify the cost function accordingly.

Additional context
I wanted to use https://github.com/SteiMi/denseweight with MCP.
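
For reference, this is the scikit-learn convention in question (an illustrative sketch with arbitrary data and model):

    import numpy as np
    from sklearn.linear_model import Ridge

    X = np.random.randn(50, 5)
    y = X @ np.ones(5)
    sample_weight = np.ones(50)  # one importance value per sample

    # scikit-learn's convention: weights are passed at fit time
    Ridge().fit(X, y, sample_weight=sample_weight)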

sujay-pandit avatar Feb 13 '24 19:02 sujay-pandit

Hi @sujay-pandit, thanks for reaching out.

At first sight it should be doable to add a sample_weights attribute to a datafit, say Quadratic, and to take it into account:

  • when computing the datafit value
  • when computing the datafit gradient
  • when computing the Lipschitz constant

The calculation should simply amount to appropriate multiplications by sample_weights[i].
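
In formulas, with sample weights $s_i$ and the $1/(2n)$ normalization used by skglm's Quadratic datafit, the weighted quantities would read:

$$
F(w) = \frac{1}{2n} \sum_{i=1}^{n} s_i \,(y_i - (Xw)_i)^2,
\qquad
\frac{\partial F}{\partial w_j} = \frac{1}{n} \sum_{i=1}^{n} s_i \, X_{ij} \,((Xw)_i - y_i),
\qquad
L_j = \frac{1}{n} \sum_{i=1}^{n} s_i \, X_{ij}^2 .
$$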

Would you like to send a pull request to start working on it? We can guide you and help you through the process.

mathurinm avatar Feb 14 '24 07:02 mathurinm

Yes, I can take that up. Thanks! Can you start a branch for me?

sujay-pandit avatar Feb 15 '24 17:02 sujay-pandit

If I start the branch, you won't be able to push to it. You need to fork the repository, push to a branch in your fork, and then open a pull request here; see https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request-from-a-fork
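
In commands, the workflow looks roughly like this (the branch name is only an example):

    # fork skglm on GitHub first, then:
    git clone git@github.com:<your-username>/skglm.git
    cd skglm
    git switch -c sample-weights
    # ...commit your work, then push and open a PR against scikit-learn-contrib/skglm
    git push -u origin sample-weights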

ChatGPT may also help on that matter :)

mathurinm avatar Feb 19 '24 08:02 mathurinm

@sujay-pandit scikit-learn has a great explanation of the pull request workflow; you can follow it, adapting the URLs (e.g., replacing scikit-learn by skglm): https://scikit-learn.org/stable/developers/contributing.html#how-to-contribute

mathurinm avatar Feb 19 '24 09:02 mathurinm

Got it! Thanks @mathurinm

sujay-pandit avatar Feb 20 '24 17:02 sujay-pandit

Any update here? This is a key feature for many problems, so I am likewise very interested.

kmedved avatar Mar 08 '24 17:03 kmedved

Pinging @sujay-pandit on this topic

mathurinm avatar Mar 08 '24 17:03 mathurinm

Hey @kmedved and @mathurinm, I did get started on it but need to finish some work for my paper submission in April. I will start working on this as soon as possible.

sujay-pandit avatar Mar 09 '24 00:03 sujay-pandit

Thanks @sujay-pandit! If you have already started writing some code, can you please open a draft PR so that other contributors can give early feedback and build on this work?

mathurinm avatar Mar 11 '24 09:03 mathurinm

Hi @mathurinm and @kmedved
Apologies for the delay in communication. I am starting afresh now.
Before I make changes, I wanted to get confirmation of my plan. For now, my focus is on implementing this for MCPRegression, which I can later extend to other estimators.
I think the majority of my changes are going to be in the datafit (Quadratic) class, when computing the loss and the gradient with the sample weights taken into account. Let me know if the plan below looks OK.

  1. I will add a parameter called 'sample_weights' to the MCPRegression constructor, defaulting to None. [class MCPRegression, estimators.py]

  2. The same parameter will also be added to the solver. [AndersonCD, anderson_cd.py]

  3. I feel the majority of my changes are here, in the datafit class. [class Quadratic, single_task.py]

    Datafit (single_task.py)
    Below are the changes I have in mind: I will scale the loss computation by sample_weights in the value method, and do the same while computing the gradient per coefficient (w_j) in the gradient_scalar method, as follows:

    import numpy as np

    # BaseDatafit lives in skglm's datafit base module
    from skglm.datafits.base import BaseDatafit


    class Quadratic(BaseDatafit):
        def __init__(self, sample_weights=None):
            self.sample_weights = sample_weights

        def value(self, y, w, Xw):
            # scale each squared residual by its sample weight
            if self.sample_weights is not None:
                return np.sum(self.sample_weights * (y - Xw) ** 2) / (2 * len(y))
            return np.sum((y - Xw) ** 2) / (2 * len(y))

        def gradient_scalar(self, X, y, w, Xw, j):
            # partial derivative of the datafit with respect to w_j
            if self.sample_weights is not None:
                return (X[:, j] * (Xw - y) * self.sample_weights).sum() / len(y)
            return (X[:, j] * (Xw - y)).sum() / len(y)

Question: I am not expecting my penalty classes (MCPenalty/WeightedMCPenalty) to change. Is my understanding correct?

Thanks!

sujay-pandit avatar May 10 '24 19:05 sujay-pandit

Hello @sujay-pandit, thanks a lot for the incoming PR. My plan would be:

  • 1 First, add the sample_weights to the Quadratic class; note that you will also have to change the Lipschitz constant computation (some work has already been started in #245)
  • 2 Then, before touching the estimator, I would check that the created datafit works with a penalty (L1, MCP) and AndersonCD, as done in this example: https://contrib.scikit-learn.org/skglm/auto_examples/plot_sparse_recovery.html#sphx-glr-auto-examples-plot-sparse-recovery-py
  • 3 Finally, modify the MCPRegression constructor as you said

IMO sample_weights should not be passed to AndersonCD: sample_weights should be passed to Quadratic, which itself will be passed to the solver.
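
For step 2, the check could look like the sketch below. GeneralizedLinearEstimator is skglm's public way to combine a datafit, a penalty and a solver; the WeightedQuadratic name and all numeric parameters are placeholders for whatever the PR ends up adding:

    import numpy as np

    from skglm import GeneralizedLinearEstimator
    from skglm.datafits import Quadratic
    from skglm.datafits import WeightedQuadratic  # assumed: the class this PR adds
    from skglm.penalties import MCPenalty
    from skglm.solvers import AndersonCD

    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 40))
    y = X @ np.r_[np.ones(3), np.zeros(37)] + 0.1 * rng.standard_normal(100)

    # sanity check: with unit weights, the weighted datafit must match Quadratic
    w_ref = GeneralizedLinearEstimator(
        Quadratic(), MCPenalty(alpha=0.1, gamma=3.0), AndersonCD()).fit(X, y).coef_
    w_new = GeneralizedLinearEstimator(
        WeightedQuadratic(np.ones(len(y))),
        MCPenalty(alpha=0.1, gamma=3.0), AndersonCD()).fit(X, y).coef_
    np.testing.assert_allclose(w_ref, w_new, rtol=1e-5)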

Question: I am not expecting my penalty(MCPenalty/WeightedMCPenalty) classes to change. Is my understanding correct?

Yes indeed! No penalty should be modified when adding this new feature.

Quick question from my side: for the design, do we want to modify the Quadratic class, or create a WeightedQuadratic class which would inherit from Quadratic? (I would vote for a new class)

QB3 avatar May 10 '24 21:05 QB3

Hi @QB3 Thanks for the prompt feedback!

  • 1 First, add the sample_weights to the Quadratic class; note that you will also have to change the Lipschitz constant computation (some work has already been started in Altered Quadratic Class #245)

Yes, you are right; for the Lipschitz computation, I will have to scale the computed values by the sample weights.
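
Concretely, something like this (a sketch, assuming the same 1/n normalization as the unweighted Lipschitz computation):

    import numpy as np

    def weighted_lipschitz(X, sample_weights):
        # per-coordinate Lipschitz constants of the weighted quadratic datafit:
        # L_j = sum_i s_i * X_ij ** 2 / n_samples
        n_samples = X.shape[0]
        return (sample_weights[:, None] * X ** 2).sum(axis=0) / n_samples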

  • 2 Then, before touching the estimator, I would check that the created datafit works with a penalty (L1, MCP) and AndersonCD, as done in this example: https://contrib.scikit-learn.org/skglm/auto_examples/plot_sparse_recovery.html#sphx-glr-auto-examples-plot-sparse-recovery-py

Sounds good

  • 3 Finally, modify the MCPRegression constructor as you said

IMO sample_weights should not be passed to AndersonCD: sample_weights should be passed to Quadratic, which itself will be passed to the solver.

Agreed, that will be cleaner.

Quick question from my side: for the design, do we want to modify the Quadratic class, or create a WeightedQuadratic class which would inherit from Quadratic? (I would vote for a new class)

Are there any more changes planned along with sample weights? If not, I would vote to just modify the existing class; I don't think adding sample weights will require many changes to it.

sujay-pandit avatar May 14 '24 02:05 sujay-pandit

My take: to support both the classic Quadratic and a weighted Quadratic in one class, I see only two options:

  • have a Quadratic.weights attribute, which is None if no weights are used and a np.array of shape (n_samples,) otherwise. We keep the existing behavior if self.weights is None. This is implemented in #245, but unfortunately I don't think this mixed typing (None or np.array) is supported by numba; see the test failure here: https://github.com/scikit-learn-contrib/skglm/actions/runs/8631223494/job/23660827782?pr=245#step:6:375
  • have a Quadratic.weights attribute, which is np.ones(n_samples) if there are no weights, and a np.array of shape (n_samples,) otherwise. I'm afraid this would slow down the code in the unweighted case; I also don't see how to set the weights to 1 by default (without using a mutable default parameter value, which is dangerous).

So it's probably easier to go for two separate classes, Quadratic and WeightedQuadratic. This should make the PR straightforward.
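
A minimal skeleton of the two-class design (a sketch; method names follow Quadratic's interface, and the real class would also need the attribute spec that skglm's datafits declare for numba compilation):

    import numpy as np

    from skglm.datafits.base import BaseDatafit


    class WeightedQuadratic(BaseDatafit):
        # Separate from Quadratic: sample_weights is always an array, so
        # numba never has to handle mixed (None / np.array) typing.
        def __init__(self, sample_weights):
            self.sample_weights = sample_weights  # shape (n_samples,)

        def value(self, y, w, Xw):
            return np.sum(self.sample_weights * (y - Xw) ** 2) / (2 * len(y))

        def gradient_scalar(self, X, y, w, Xw, j):
            return (self.sample_weights * X[:, j] * (Xw - y)).sum() / len(y)

        def get_lipschitz(self, X, y):
            # weighted per-coordinate Lipschitz constants, cf. the formulas above
            return (self.sample_weights[:, None] * X ** 2).sum(axis=0) / len(y)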

mathurinm avatar May 14 '24 06:05 mathurinm

@mathurinm and @QB3, in that case we can go for separate classes; I wasn't aware of this numba issue. I have a question: the changes in #245 and this issue will clash. Can we keep just one of them? Or should I wait until the changes in #245 are complete?

sujay-pandit avatar May 14 '24 15:05 sujay-pandit

#245 is stalled, you can start a separate PR.

You can even start from it, to reuse the existing work. To do so, first fetch the PR branch, then switch to a new branch:

git fetch git@github.com:scikit-learn-contrib/skglm pull/245/head:nbb
git switch nbb
git switch -c your_desired_branch_name

and start working as usual on a branch.

mathurinm avatar May 14 '24 16:05 mathurinm

@sujay-pandit are you working on this? Some people asked for it offline, so if you have not started already, I will implement it. But if you're working on this, I'm happy to give feedback.

mathurinm avatar May 28 '24 10:05 mathurinm

@mathurinm I will push my changes by Thursday EOD. If I am not able to, I will hand it over to you. Apologies for the delay.

sujay-pandit avatar May 28 '24 15:05 sujay-pandit

Hi @mathurinm @kmedved @QB3, I have added a new class WeightedQuadratic in datafits/single_task.py and created a PR. I have tested it with examples/plot_sparse_recovery.py and it seems to work fine, both with all samples weighted equally to 1 and with weights generated using the DenseWeight library. I haven't added the changes to MCPRegression in estimators.py; if these changes are OK, I can move on to them. Waiting for your feedback, thanks!
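
For reference, the DenseWeight check looks roughly like this (a sketch; DenseWeight's alpha and the penalty parameters are arbitrary):

    import numpy as np
    from denseweight import DenseWeight

    from skglm import GeneralizedLinearEstimator
    from skglm.datafits import WeightedQuadratic  # the new class from the PR
    from skglm.penalties import MCPenalty
    from skglm.solvers import AndersonCD

    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 50))
    y = X @ np.r_[np.ones(5), np.zeros(45)] + 0.1 * rng.standard_normal(200)

    # density-based sample weights, as in the original feature request
    dw = DenseWeight(alpha=1.0)
    sample_weights = dw.fit(y)

    model = GeneralizedLinearEstimator(
        datafit=WeightedQuadratic(sample_weights),
        penalty=MCPenalty(alpha=0.1, gamma=3.0),
        solver=AndersonCD(),
    )
    model.fit(X, y)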

sujay-pandit avatar May 31 '24 17:05 sujay-pandit

This was implemented in #258 through a WeightedQuadratic class :rocket:

@sujay-pandit thanks one more time for the work; we look forward to hearing some feedback from you. Please spread the word about skglm too!

mathurinm avatar Jun 24 '24 14:06 mathurinm