glum icon indicating copy to clipboard operation
glum copied to clipboard

glum v3.0 alpha

Open MartinStancsicsQC opened this issue 1 year ago • 3 comments

Checklist

  • [x] Added a CHANGELOG.rst entry

This PR contains the changes needed to make glum compatible with the new major release of tabmat. Branches and PRs that make use of new functionality in Tabmat 4.0 will be based on this gulm-v3 branch.

Related issues: closes #730, closes #728, closes #583.

MartinStancsicsQC avatar Aug 14 '23 12:08 MartinStancsicsQC

Since we're doing a breaking release anyway, would it be possible to rearrange the arguments of coef_table? In particular, to move confidence_level down, closer to robust and friends. In most methods, we're able to pass X and y as the first and second positional arguments, so coef_table keeps catching me out.

lbittarello avatar Feb 12 '24 19:02 lbittarello

In most methods, we're able to pass X and y as the first and second positional arguments, so coef_table keeps catching me out.

I am fine with changing the order in coef_table if we also change the order in wald_test in order to stick with the rule "any method with arguments X and y should have X on first and y on second position." This is slightly weird though because either R or features is required in wald_test, but X and y are optional.

Currently, the order is at least consistent with the rule "optional arguments that are passed to self.covariance() should come after specific arguments." What do you think @lbittarello ?

I agree, If we change covariance_matrix we should change wald_test, too. Putting required arguments after optionals looks weird at first, but it would be possible because they are "technically optional", and their presence is enforced within the function, not in the function signature.

Tbh I don't think the R, features and formula arguments in wald_test should ever be used positionally anyway, as that is quite unclear. What about making those arguments keyword only (and the same with confidence_level in coef_table)? Or, if we are breaking stuff anyways, what about making most of the arguments in GeneralizedLinearRegressor's public methods (i.e. everything other than X and y; maybe also sample_weight) keyword-only? They do not really have a consistent order anyways, and this would give us more flexibility in rearranging stuff in a non-breaking way in the future.

stanmart avatar Feb 14 '24 07:02 stanmart

FYI, CI is currently failing because tabmat 4.0.0 is not available on conda-forge yet.

The reported errors are:
- Encountered problems while solving:
-   - nothing provides requested tabmat >=4.0.0

New CI failure is due to missing libblas >=0 *mkl when solving environment on macos-latest - Py3.12. Same failure occurs on main.

I think they switched macos-latest to being an ARM-based image in the last two days. Thus, you cannot install the Intel-only libblas >=0 *mkl package there.

xhochy avatar Apr 26 '24 11:04 xhochy

I added a separate environment for macos without the libblas >=0 *mkl dependency. This has the additional benefit that it is more prominent to Mac users who might miss the comment in the standard environment.yml IMO.

Nightly failures (or at least most of them) should because of shorthand scipy.sparse.*_matrix.A in the simulations, which is deprecated in future scipy versions, see also here. Will replace them on this branch.

The remaining 34 failures on the nightly builds are due to incompatibilities between upstream packages and pre-release versions of other dependencies, namely:

  • tabmat calling scipy.sparse.*_matrix.A, which will be deprecated in scipy, issue here,
  • formulaic not yet being compatible with the pandas 3.0.0* development version, issue here.

These should be addressed in the upstream packages.

Unless anyone objects, I will merge the PR and release soon.