skglm
ENH - check ``datafit + penalty`` compatibility with solver
A quick proof of concept of a function that checks whether the combination (solver, datafit, penalty) is supported. Currently there are edge cases where one can pass the ProxNewton solver with the L0_5 penalty without any error being raised.
Pros of this design: the validation rules are centralized, and validating a (solver, datafit, penalty) 3-tuple is a one-liner in `_glm_fit`.
Cons: we have to update the rules as we enhance the capabilities of the solver.
All in all, I think it is very valuable to have more verbose errors when fitting estimators (e.g. Ali Rahimi initially passed the combination (Quadratic, L2_3, ProxNewton), which cannot be optimized at the time of writing).
Closes #101 Closes #90 Closes #109
With this PR, the errors are more verbose:
In [1]: from skglm.estimators import GeneralizedLinearEstimator
from skglm.penalties import L0_5
from skglm.datafits import Quadratic, Logistic
from skglm.solvers import ProxNewton, AndersonCD
import numpy as np
In [2]: X = np.random.normal(0, 1, (30, 50))
y = np.random.normal(0, 1, (30,))
In [3]: clf = GeneralizedLinearEstimator(Quadratic(), L0_5(1.), ProxNewton())
In [4]: clf.fit(X, y)
---------------------------------------------------------------------------
Exception Traceback (most recent call last)
Input In [4], in <cell line: 1>()
----> 1 clf.fit(X, y)
File ~/Documents/skglm/skglm/estimators.py:241, in GeneralizedLinearEstimator.fit(self, X, y)
238 self.datafit = self.datafit if self.datafit else Quadratic()
239 self.solver = self.solver if self.solver else AndersonCD()
--> 241 return _glm_fit(X, y, self, self.datafit, self.penalty, self.solver)
File ~/Documents/skglm/skglm/estimators.py:29, in _glm_fit(X, y, model, datafit, penalty, solver)
27 is_classif = isinstance(datafit, (Logistic, QuadraticSVC))
28 fit_intercept = solver.fit_intercept
---> 29 validate_solver(solver, datafit, penalty)
31 if is_classif:
32 check_classification_targets(y)
File ~/Documents/skglm/skglm/utils/dispatcher.py:21, in validate_solver(solver, datafit, penalty)
6 """Ensure the solver is suited for the `datafit` + `penalty` problem.
7
8 Parameters
(...)
17 Penalty.
18 """
19 if (isinstance(solver, ProxNewton)
20 and not set(("raw_grad", "raw_hessian")) <= set(dir(datafit))):
---> 21 raise Exception(
     22     f"ProxNewton cannot optimize {datafit.__class__.__name__}, since `raw_grad`"
23 " and `raw_hessian` are not implemented.")
24 if ("ws_strategy" in dir(solver) and solver.ws_strategy == "subdiff"
25 and isinstance(penalty, (L0_5, L2_3))):
26 raise Exception(
27 "ws_strategy=`subdiff` is not available for Lp penalties (p < 1). "
28 "Set ws_strategy to `fixpoint`.")
Exception: ProxNewton cannot optimize Quadratic, since `raw_grad` and `raw_hessian` are not implemented.
Looks nice @PABannier, this will definitely improve UX!
From an API point of view, shouldn't this check be delegated to each solver? This way we don't have one big function, but Solver.validate(datafit, penalty), in the spirit of what @Badr-MOUFAD implemented here : https://github.com/scikit-learn-contrib/skglm/blob/main/skglm/experimental/pdcd_ws.py#L201
Such functions could also take care of the initialization (e.g. stepsize computation) which is done on a solver basis. WDYT?
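The delegated design could look like the following sketch (hypothetical names and class layout, in the spirit of the linked `pdcd_ws` example; this is not the actual skglm implementation):

```python
# Hypothetical sketch of per-solver validation: each solver declares the
# attributes its datafit/penalty must implement, and a shared validate()
# raises an informative error otherwise.

class BaseSolver:
    # attributes the datafit / penalty must expose for this solver to work
    _datafit_required_attr = ()
    _penalty_required_attr = ()

    def validate(self, datafit, penalty):
        """Raise an informative error if (datafit, penalty) is unsupported."""
        for obj, required in ((datafit, self._datafit_required_attr),
                              (penalty, self._penalty_required_attr)):
            missing = [a for a in required if not hasattr(obj, a)]
            if missing:
                raise AttributeError(
                    f"{self.__class__.__name__} cannot optimize "
                    f"{obj.__class__.__name__}: missing {', '.join(missing)}.")


class ProxNewton(BaseSolver):
    # ProxNewton needs second-order information from the datafit
    _datafit_required_attr = ("raw_grad", "raw_hessian")
```

With this layout, `_glm_fit` only needs to call `solver.validate(datafit, penalty)`, and each solver owns its own rules.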
@mathurinm Yes I think it's cleaner, currently refining the POC.
This would be a nice addition if we can ship it in the 0.3 release @Badr-MOUFAD , given that we added a few datafits, penalties and solvers !
@Badr-MOUFAD the issue popped up in #188, do you have time to take this over ? A simple check, at the beginning of each solver, that the datafit and penalty are supported (eg AndersonCD does not support Gamma datafit)
Sure, I will resume this PR.
Requires #191 to be implemented first to allow for better checks
Trying to revive this to release v0.4. Some comments for discussion @QB3 @Badr-MOUFAD
- I feel strongly against having both `solver.solve()` and `solver()`. "There should be one-- and preferably only one --obvious way to do it". The way I see it, all `solve` methods should be renamed `_solve`, and `BaseSolver` should implement its own `solve` that runs the checks and then calls `_solve`. What do you think? This is non-breaking from the API point of view, and just requires people implementing their own solver to adapt (I have no such example in mind; we can add a test that no solver overrides `solve` by checking that `A.solve == BaseSolver.solve`). `BaseSolver.solve` can have a `run_checks` argument which is True by default, and can be disabled if one wants to be faster (any idea on the impact of the checks @Badr-MOUFAD?)
- I feel like checking for sparse support is heavy at the moment and could be done in another PR, but that may just be me :)
- I have mixed feelings about the name `check_obj_solver_attr`; see my comment, I'd go for `check_attr` and not pass `solver`, which is used only for its name. `custom_compatibility_check` could become `custom_checks` (?)
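The `solve`/`_solve` split proposed above could be sketched as follows (a hypothetical sketch with placeholder bodies, not the actual skglm code):

```python
# Hypothetical sketch: BaseSolver.solve is the single public entry point that
# runs the checks, then delegates to the subclass-specific _solve.

class BaseSolver:
    def solve(self, X, y, datafit, penalty, run_checks=True):
        # public entry point: validate first (unless disabled), then delegate
        if run_checks:
            self._validate(datafit, penalty)
        return self._solve(X, y, datafit, penalty)

    def _validate(self, datafit, penalty):
        """Subclasses override this with their compatibility checks."""

    def _solve(self, X, y, datafit, penalty):
        raise NotImplementedError


class AndersonCD(BaseSolver):
    def _solve(self, X, y, datafit, penalty):
        return "solved"  # placeholder for the actual CD iterations


# the "no subclass overrides solve" test reduces to an identity check:
assert AndersonCD.solve is BaseSolver.solve
```

Since `solve` lives only on `BaseSolver`, the identity check at the end is exactly the non-regression test suggested above.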
- I’m +1 with having a `_solve` method and adding a `run_checks` argument to the `solve` method. I feel it gives us more freedom to standardize the behavior and makes the `solve` method less verbose. I don’t think the checks have a big overhead, though I didn’t check that in practice.
- The support of sparse data is covered in the `check_obj_solver_attr` function, so I don’t think it hurts us to cover it in this PR.
- I have no strong opinion about the names, I’m +1 with your proposed names @mathurinm
I think the error messages help a lot. Suggestion: state the role of each object (datafit/penalty and solver) in the message,
from
AttributeError: Poisson is not compatible with AndersonCD. It must implement `get_lipschitz` and `gradient_scalar`
Missing `get_lipschitz`.
to
AttributeError: Datafit `Poisson` is not compatible with solver `AndersonCD`. It must implement `get_lipschitz` and `gradient_scalar`
Missing `get_lipschitz`.
EDIT: I implemented it for the solver; to distinguish whether `obj` is a datafit or a penalty, we can have two arguments in `check_obj_solver_attr`, equal to None by default, and infer from there:
def check_obj_solver_attr(solver, required_attr, df=None, pen=None, support_sparse=False):
    if df is None:
        name = "penalty"
    elif pen is None:
        name = "datafit"
    else:
        raise ValueError("Can't pass both a penalty and a datafit")
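For illustration, here is one hypothetical way to complete `check_obj_solver_attr` so the inferred role name prefixes the error message as suggested above (a sketch, not the actual skglm code; sparse-support handling is omitted):

```python
# Hypothetical completion of check_obj_solver_attr: infer whether obj is the
# datafit or the penalty, then report missing attributes with a role prefix.

def check_obj_solver_attr(solver, required_attr, df=None, pen=None,
                          support_sparse=False):
    if df is None:
        obj, name = pen, "penalty"
    elif pen is None:
        obj, name = df, "datafit"
    else:
        raise ValueError("Can't pass both a penalty and a datafit")
    missing = [attr for attr in required_attr if not hasattr(obj, attr)]
    if missing:
        raise AttributeError(
            f"{name.capitalize()} `{obj.__class__.__name__}` is not compatible "
            f"with solver `{solver.__class__.__name__}`. "
            f"Missing `{'`, `'.join(missing)}`.")
```

This produces messages in the suggested form, e.g. ``Datafit `Poisson` is not compatible with solver `AndersonCD`. Missing `get_lipschitz`, `gradient_scalar`.``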
- I’m +1 with having a `_solve` method and adding a `run_checks` argument to the `solve` method. I feel it gives us more freedom to standardize the behavior and makes the `solve` method less verbose. I don’t think the checks have a big overhead, though I didn’t check that in practice.
Let's perform checks all the time for now, this will simplify our lives. WDYT?
- The support of sparse data is covered in the `check_obj_solver_attr` function, so I don’t think it hurts us to cover it in this PR.
OK
- I have no strong opinion about the names, I’m +1 with your proposed names @mathurinm
@QB3 any opinion?