skglm
FEAT - Sample weights
Description of the feature
Scikit-learn's `fit` methods accept a `sample_weight` array to assign an importance to each sample and modify the cost function accordingly.
Additional context: I wanted to use https://github.com/SteiMi/denseweight with MCP.
Hi @sujay-pandit, thanks for reaching out.
At first sight it should be doable to add a `sample_weights` attribute to a datafit, say `Quadratic`, and to take it into account:
- when computing the datafit value
- when computing the datafit gradient
- when computing the Lipschitz constants
The calculation should simply amount to appropriate multiplications by `sample_weights[i]`.
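As a concrete illustration of those multiplications, here is a numpy-only sketch; the standalone function names and signatures are illustrative, not skglm's actual datafit API:

```python
import numpy as np

def weighted_value(y, Xw, sample_weights):
    # weighted quadratic datafit: (1 / (2 n)) * sum_i w_i * (y_i - (Xw)_i) ** 2
    n = len(y)
    return np.sum(sample_weights * (y - Xw) ** 2) / (2 * n)

def weighted_gradient_scalar(X, y, Xw, sample_weights, j):
    # j-th partial derivative: (1 / n) * sum_i w_i * X_ij * ((Xw)_i - y_i)
    n = len(y)
    return (sample_weights * X[:, j] * (Xw - y)).sum() / n
```

With unit weights these reduce exactly to the classic quadratic datafit formulas, which is a useful sanity check for the PR's tests.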
Would you like to send a pull request to start working on it? We can guide you and help you through the process.
Yes, I can take that up. Thanks! Can you start a branch for me?
If I start the branch, you won't be able to push to it. Instead, fork the repository, push to a branch in your fork, and then open a pull request here. See https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request-from-a-fork
ChatGPT may also help on that matter :)
@sujay-pandit scikit-learn has a great explanation of pull requests; you can follow it, adapting the URLs (e.g. replacing scikit-learn by skglm): https://scikit-learn.org/stable/developers/contributing.html#how-to-contribute
Got it! Thanks @mathurinm
Any update here? This is a key feature for many problems, so I am likewise very interested.
Pinging @sujay-pandit on this topic
Hey @kmedved and @mathurinm, I did get started on it but need to finish some work for my paper submission in April. I will start working on this as soon as possible.
Thanks @sujay-pandit! If you have already started writing some code, could you please open a draft PR, so that other contributors can give early feedback and build on this work?
Hi @mathurinm and @kmedved
Apologies for the delay in communication. I am starting afresh now.
Before I make changes, I wanted to get a confirmation of my plan. For now, my focus is on implementing this for MCPRegression, which I can later extend to others.
I think the majority of my changes will be in the datafit (`Quadratic`) class, in the loss and gradient computations with respect to the sample weights. Let me know if the plan below looks OK.
- I will add a parameter to the `MCPRegression` constructor called `sample_weights`, which will default to `None`. [class MCPRegression, estimators.py]
- The same parameter will also be added to the solver. [AndersonCD, anderson_cd.py]
- I feel the majority of my changes are in the datafit class. [class Quadratic, single_task.py]
Datafit (single_task.py)
Below are the changes I have in mind. I will scale the loss computation by `sample_weights` in the `value` method, and do the same while computing the gradient for each coordinate `w_j` in the `gradient_scalar` method:

```python
class Quadratic(BaseDatafit):
    def __init__(self, sample_weights=None):
        self.sample_weights = sample_weights

    def value(self, y, w, Xw):
        if self.sample_weights is not None:
            return np.sum(self.sample_weights * (y - Xw) ** 2) / (2 * len(y))
        return np.sum((y - Xw) ** 2) / (2 * len(y))

    def gradient_scalar(self, X, y, w, Xw, j):
        if self.sample_weights is not None:
            return (X[:, j] * (Xw - y) * self.sample_weights).sum() / len(y)
        return (X[:, j] * (Xw - y)).sum() / len(y)
```
Question: I am not expecting my penalty classes (`MCPenalty`/`WeightedMCPenalty`) to change. Is my understanding correct?
Thanks!
Hello @sujay-pandit, thanks a lot for the incoming PR. My plan would be:
1. First, add `sample_weights` to the `Quadratic` class; note that you will also have to change the Lipschitz constant computation (some work has already been started in #245).
2. Then, before touching the estimator, I would check that the created datafit works with a penalty (L1, MCP) and `AndersonCD`, like done in this example: https://contrib.scikit-learn.org/skglm/auto_examples/plot_sparse_recovery.html#sphx-glr-auto-examples-plot-sparse-recovery-py
3. Finally, modify the `MCPRegression` constructor as you said.
IMO `sample_weights` should not be passed to `AndersonCD`: `sample_weights` should be passed to `Quadratic`, which itself will be passed to the solver.
> Question: I am not expecting my penalty classes (`MCPenalty`/`WeightedMCPenalty`) to change. Is my understanding correct?
Yes indeed! No penalty should be modified when adding this new feature.
Quick question from my side: for the design, do we want to modify the `Quadratic` class, or to create a `WeightedQuadratic` class, which would inherit from `Quadratic`? (I would vote for a new class.)
Hi @QB3 Thanks for the prompt feedback!
> 1. First, add `sample_weights` to the `Quadratic` class; note that you will also have to change the Lipschitz constant computation (some work has already been started in #245, Altered Quadratic Class).

Yes, you are right: for the Lipschitz computation, I will have to scale the computed values by the sample weights.
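That scaling can be sketched with numpy only; the function name and signature below are illustrative, not skglm's actual API:

```python
import numpy as np

def weighted_lipschitz(X, sample_weights):
    # per-coordinate Lipschitz constants of the weighted quadratic gradient:
    #   L_j = (1 / n) * sum_i w_i * X_ij ** 2
    # which reduces to the classic (X[:, j] ** 2).sum() / n when all weights are 1
    n = X.shape[0]
    return (sample_weights[:, None] * X ** 2).sum(axis=0) / n
```

The constant `L_j` is exact here: for a quadratic datafit, the coordinate gradient changes by exactly `L_j * t` when `w_j` changes by `t`.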
> 2. Then, before touching the estimator, I would check that the created datafit works with a penalty (L1, MCP) and `AndersonCD`, like done in this example: https://contrib.scikit-learn.org/skglm/auto_examples/plot_sparse_recovery.html#sphx-glr-auto-examples-plot-sparse-recovery-py

Sounds good.
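A self-contained version of that check, using plain ISTA (proximal gradient) in place of `AndersonCD` so it does not depend on skglm internals (all names here are illustrative), could look like:

```python
import numpy as np

def soft_threshold(x, tau):
    # proximal operator of the L1 penalty
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def ista_weighted_lasso(X, y, sample_weights, alpha, n_iter=1000):
    """Proximal gradient on (1/(2n)) * sum_i w_i (y_i - (Xw)_i)^2 + alpha * ||w||_1."""
    n, p = X.shape
    # global Lipschitz constant of the weighted quadratic gradient
    Xs = np.sqrt(sample_weights)[:, None] * X
    L = np.linalg.norm(Xs, ord=2) ** 2 / n
    w = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (sample_weights * (X @ w - y)) / n
        w = soft_threshold(w - grad / L, alpha / L)
    return w
```

At a solution, the first-order optimality condition for the Lasso holds: every coordinate of the datafit gradient has magnitude at most `alpha`, which gives a simple convergence check analogous to the sparse recovery example.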
> 3. Finally, modify the `MCPRegression` constructor as you said. IMO `sample_weights` should not be passed to `AndersonCD`: `sample_weights` should be passed to `Quadratic`, which itself will be passed to the solver.

Agreed, that will be cleaner.
> Quick question from my side: for the design, do we want to modify the `Quadratic` class, or to create a `WeightedQuadratic` class, which would inherit from `Quadratic`? (I would vote for a new class.)

Are there any more changes planned along with the sample weights? I don't think adding sample weights will require many changes to the existing class. If not, I would vote to just modify the existing class.
My take: to support both the classic and the weighted quadratic in one class, I see only two options:

- have a `Quadratic.weights` attribute, which is `None` if no weights are used and a `np.array` of shape `(n_samples,)` otherwise, keeping the existing behavior if `self.weights is None`. This is implemented in #245, but unfortunately I don't think this mixed typing (`None` or `np.array`) is supported by numba; see the test failure here: https://github.com/scikit-learn-contrib/skglm/actions/runs/8631223494/job/23660827782?pr=245#step:6:375
- have a `Quadratic.weights` attribute, which is `np.ones(n_samples)` if there are no weights, and a `np.array` of shape `(n_samples,)` otherwise. I'm afraid this would slow down the code in the unweighted case; I also don't see how to set the weights to 1 by default (without using a mutable default parameter value, which is dangerous).

So it's probably easier to go for two separate classes, `Quadratic` and `WeightedQuadratic`. This should make the PR straightforward.
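A minimal, numba-free sketch of the two-class design (the method names mirror skglm's datafit API, but this is an illustration, not the actual implementation): because `sample_weights` is always an array in the subclass, every attribute keeps a single fixed type, which is what numba's jitclasses require.

```python
import numpy as np

class Quadratic:
    """Unweighted quadratic datafit (simplified sketch)."""
    def value(self, y, w, Xw):
        return np.sum((y - Xw) ** 2) / (2 * len(y))

    def gradient_scalar(self, X, y, w, Xw, j):
        return (X[:, j] * (Xw - y)).sum() / len(y)


class WeightedQuadratic(Quadratic):
    """Weighted variant: weights are always an array, never None,
    so the attribute typing stays fixed."""
    def __init__(self, sample_weights):
        self.sample_weights = np.asarray(sample_weights, dtype=float)

    def value(self, y, w, Xw):
        return np.sum(self.sample_weights * (y - Xw) ** 2) / (2 * len(y))

    def gradient_scalar(self, X, y, w, Xw, j):
        return (self.sample_weights * X[:, j] * (Xw - y)).sum() / len(y)
```

With unit weights the subclass agrees with the base class, and the base class pays no per-iteration cost for a weights array it never uses.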
@mathurinm and @QB3, in that case we can go for separate classes; I wasn't aware of this numba issue. One question: the changes in #245 and this issue will clash. Should we keep just one of them, or should I wait until the changes in #245 are complete?
#245 is stalled, you can start a separate PR.
You can even start from it, to reuse existing work. To do so first fetch the PR branch, then switch to a new branch:
```shell
git fetch git@github.com:scikit-learn-contrib/skglm pull/245/head:nbb
git switch nbb
git switch -c your_desired_branch_name
```
and start working as usual on a branch.
@sujay-pandit are you working on this? Some people asked for it offline, so if you have not started already, I will implement it. But if you're working on this, I'm happy to give feedback.
@mathurinm I will push my changes by Thursday EOD. If I am not able to, I will hand it over to you guys. Apologies for the delay.
Hi @mathurinm @kmedved @QB3, I have added a new class `WeightedQuadratic` in datafits/single_task.py and created a PR. I have tested it with examples/plot_sparse_recovery.py and it seems to work fine, both with all samples weighted equally to 1 and with weights generated using the DenseWeight library. I haven't added the changes to `MCPRegression` in estimators.py; if these changes are OK, I can move on to them. Waiting for your feedback, thanks!
This was implemented in #258 through a `WeightedQuadratic` class :rocket:
@sujay-pandit thanks once more for the work, we look forward to hearing some feedback from you. Please spread the word about skglm too!