GLM.jl

Version 2.0 Breaking Changes

Open palday opened this issue 2 years ago • 5 comments

I've noticed that some packages rely on TableRegressionModel to support GLM: https://github.com/jmboehm/RegressionTables.jl/issues/128, https://github.com/yufongpeng/AnovaBase.jl/issues/52 and https://github.com/yufongpeng/AnovaGLM.jl/issues/6. Even if they adapt to support the new approach, we'd better bump the version to 2.0 to avoid any breakage. That can also be the occasion to drop some long-deprecated APIs. We should check whether we would like to make any other breaking changes. (A few other packages use TableRegressionModel for their own models; it would be good if they also stopped using it, but there's no hurry.)

Originally posted by @nalimilan in https://github.com/JuliaStats/GLM.jl/issues/339#issuecomment-1242947968

Here's a quick list of potential issues that we might want to address as part of a push towards 2.0. Several are relatively straightforward, some could potentially be solved via more extensive documentation, and some will require Decisions to be made (e.g. all the stuff with weights).

  • [x] #339
  • [ ] remove deprecations (search the source for deprecate to catch "manual" deprecation warnings)
  • [x] all the issues related to handling of rank deficiency / multicollinearity (#449, #426, #413, #375, #280) because this involves potentially changing defaults
  • [ ] potentially exposing a way for the user to choose between QR/Cholesky with the formula interface? Maybe even defaulting to the slightly slower but more stable QR method?
  • [ ] #483
  • [ ] #487
  • [ ] #350
  • [ ] #259
  • [ ] #255
  • [ ] #240
  • [ ] drop support for Julia < 1.6 and strip out all the associated tests for output in those versions
  • [ ] make internal fieldnames more transparent or at least add some comments to the struct definitions

There are several other issues I would like to see addressed sooner rather than later, but all are technically nonbreaking, at least under ColPrac guidelines (e.g., changes to the show methods, as raised in #461 and #469).

palday • Sep 29 '22 04:09

Right now, we are working on GLM with QR decomposition in two steps:

  1. LM with QR
  2. GLM with QR

The target is to complete this by the end of this calendar year. We hope this will resolve some of the issues related to the PosDefException mentioned above.
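
To illustrate why QR is the more stable route, here is a minimal sketch using only Julia's LinearAlgebra standard library rather than GLM.jl's internals. The toy matrix `X` and response `y` are made up for illustration. The Cholesky route factorizes `X'X`, which squares the condition number and fails with a PosDefException when `X'X` is not numerically positive definite; the QR route factorizes `X` directly.

```julia
using LinearAlgebra

# Hypothetical toy design matrix (intercept plus one predictor) and response.
X = [1.0 1.0;
     1.0 2.0;
     1.0 3.0;
     1.0 4.0]
y = [1.0, 2.0, 2.0, 4.0]

# Cholesky route: solve the normal equations (X'X) * beta = X'y.
# cholesky throws PosDefException when X'X is not numerically positive definite.
beta_chol = cholesky(Symmetric(X' * X)) \ (X' * y)

# QR route: factorize X itself and solve the least-squares problem.
beta_qr = qr(X) \ y

# Forming X'X squares the condition number of the problem.
cond_X = cond(X)
cond_XtX = cond(X' * X)
```

On well-conditioned data like this, both routes should agree to within floating-point error; the difference only shows up as the columns of `X` approach collinearity.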

mousum-github • Oct 10 '22 04:10

I would also like to see multiple dependent variables and quasi-likelihood in GLM 2.0.

mousum-github • Oct 10 '22 04:10

Nice to hear you're working on QR! I think we can wait until you finish that before tagging 2.0. On the other hand, multiple dependent variables and quasi-likelihood do not change current behavior, so they can be added later (and we have to discuss whether they should live in this package or in a separate one).

nalimilan • Oct 10 '22 06:10

I don't think we should do anything about https://github.com/JuliaStats/GLM.jl/issues/259. In any case, https://github.com/JuliaStats/GLM.jl/pull/487 will change nobs to return an integer: the presence of weights is now part of the type, so there's no type instability. People who need the number of rows in the model matrix can use size(modelmatrix(m), 1).
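
As a rough sketch of the two calls mentioned here, assuming GLM.jl and DataFrames.jl are installed and using a made-up four-row dataset: the row count comes from the model matrix, while the return type of `nobs` depends on the GLM version in use.

```julia
using DataFrames, GLM

# Hypothetical toy data, purely for illustration.
df = DataFrame(x = [1.0, 2.0, 3.0, 4.0], y = [1.1, 1.9, 3.2, 3.9])
m = lm(@formula(y ~ x), df)

n_rows = size(modelmatrix(m), 1)  # number of rows in the model matrix
n_obs = nobs(m)                   # observation count; its type depends on the GLM version
```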

#483, https://github.com/JuliaStats/GLM.jl/issues/255 and https://github.com/JuliaStats/GLM.jl/issues/240 would be good to have, but not breaking AFAICT.

nalimilan • Oct 21 '22 07:10

I hope 2.0 fixes https://github.com/JuliaStats/GLM.jl/issues/496 and throws an error on missing values to protect users from making analytical errors by accident. lm(...; skipmissing=true) seems fine to me.
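
Until something like that exists, a user-side guard is one way to get the same protection. The sketch below is not part of GLM.jl: `check_no_missing` is a hypothetical helper, and the column tables are made-up data. It simply refuses to continue when any column contains missing values, so the check can run before a model is fitted.

```julia
# Hypothetical helper (not part of GLM.jl): error out if any column has missing values.
function check_no_missing(cols::NamedTuple)
    for (name, col) in pairs(cols)
        if any(ismissing, col)
            throw(ArgumentError("column $name contains missing values"))
        end
    end
    return cols
end

# Made-up column tables: one complete, one with a missing value in x.
complete = (x = [1.0, 2.0, 3.0], y = [2.0, 4.1, 5.9])
gappy = (x = [1.0, missing, 3.0], y = [2.0, 4.1, 5.9])

check_no_missing(complete)  # passes and returns the columns unchanged

# Record whether the guard rejects the table that contains a missing value.
threw = try
    check_no_missing(gappy)
    false
catch err
    err isa ArgumentError
end
```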

jariji • Jan 05 '23 19:01