glum
glum copied to clipboard
glum v3.0 alpha
Checklist
- [x] Added a
CHANGELOG.rst
entry
This PR contains the changes needed to make glum compatible with the new major release of tabmat. Branches and PRs that make use of new functionality in Tabmat 4.0 will be based on this gulm-v3
branch.
Related issues: closes #730, closes #728, closes #583.
Since we're doing a breaking release anyway, would it be possible to rearrange the arguments of coef_table
? In particular, to move confidence_level
down, closer to robust
and friends. In most methods, we're able to pass X
and y
as the first and second positional arguments, so coef_table
keeps catching me out.
In most methods, we're able to pass X and y as the first and second positional arguments, so coef_table keeps catching me out.
I am fine with changing the order in coef_table
if we also change the order in wald_test
in order to stick with the rule "any method with arguments X
and y
should have X
on first and y
on second position." This is slightly weird though because either R
or features
is required in wald_test
, but X
and y
are optional.
Currently, the order is at least consistent with the rule "optional arguments that are passed to self.covariance()
should come after specific arguments." What do you think @lbittarello ?
I agree, If we change covariance_matrix
we should change wald_test
, too. Putting required arguments after optionals looks weird at first, but it would be possible because they are "technically optional", and their presence is enforced within the function, not in the function signature.
Tbh I don't think the R
, features
and formula
arguments in wald_test
should ever be used positionally anyway, as that is quite unclear. What about making those arguments keyword only (and the same with confidence_level
in coef_table
)? Or, if we are breaking stuff anyways, what about making most of the arguments in GeneralizedLinearRegressor
's public methods (i.e. everything other than X
and y
; maybe also sample_weight
) keyword-only? They do not really have a consistent order anyways, and this would give us more flexibility in rearranging stuff in a non-breaking way in the future.
FYI, CI is currently failing because tabmat 4.0.0 is not available on conda-forge yet.
The reported errors are:
- Encountered problems while solving:
- - nothing provides requested tabmat >=4.0.0
New CI failure is due to missing libblas >=0 *mkl
when solving environment on macos-latest - Py3.12. Same failure occurs on main.
I think they switched macos-latest
to being an ARM-based image in the last two days. Thus, you cannot install the Intel-only libblas >=0 *mkl
package there.
I added a separate environment for macos without the libblas >=0 *mkl
dependency. This has the additional benefit that it is more prominent to Mac users who might miss the comment in the standard environment.yml
IMO.
Nightly failures (or at least most of them) should because of shorthand scipy.sparse.*_matrix.A
in the simulations, which is deprecated in future scipy versions, see also here. Will replace them on this branch.
The remaining 34 failures on the nightly builds are due to incompatibilities between upstream packages and pre-release versions of other dependencies, namely:
- tabmat calling
scipy.sparse.*_matrix.A
, which will be deprecated in scipy, issue here, - formulaic not yet being compatible with the pandas 3.0.0* development version, issue here.
These should be addressed in the upstream packages.
Unless anyone objects, I will merge the PR and release soon.