matminer
Missing compatibility with pandas v2
In the current state of master we are allowing people to install matminer with pandas v2, yet many featurizers do not work in this state. I made as much progress as I could in #912 but cannot justify spending any more time on this myself right now. This should be addressed before the next release.
If only to add a public voice that this is an issue...
I have also found that this dependency makes matminer difficult to use alongside other modern packages.
Proposed solution: deprecate the featurizers that are in some sense "broken" by this pandas upgrade and be more strict in our pinning of dependencies (in order to match the current sustainable release cadence of this package without dedicated maintainers) in preparation for a v0.10.0 release.
In the future, when things break due to upstream packages, people are more than welcome to submit fixes (and I am happy to review/merge/release) but I cannot sign up for the ongoing maintenance of keeping deps up to date.
Any opinions? @ardunn @tschaume ? I guess it could break subsequent code. I would love to keep on using matminer but in its current state that's probably hard to do.
@JaGeo matminer used to have a pin on pandas~=1.5, which is keeping MP from staying up-to-date with pandas (this would be true of any upper-bound pins that matminer enforces in its dependencies). I use the present tense here because there hasn't been a release of matminer since the pandas requirement was removed in https://github.com/hackingmaterials/matminer/commit/237603cdf5bd43675af45109ee3b7baf904cdda9. As @ml-evs mentioned, we're waiting for 0.10.0 to be released. We're also looking into how matminer enters MP's dependency stack as a required dependency and how we could make that dependency optional.
I agree with @ml-evs that featurizers that don't support recent versions of pandas should probably be deprecated. Alternatively, they could run if a compatible version of pandas happens to be installed and throw a warning otherwise.
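A minimal sketch of the "run but warn" option, assuming a hypothetical decorator named `warn_if_pandas_newer_than` (nothing like this exists in matminer as far as this thread shows) and a stand-in function `featurize_example`. It only compares major versions, and uses the standard library to look up the installed pandas version:

```python
import functools
import warnings
from importlib import metadata


def installed_major(package):
    """Return the installed major version of `package`, or None if unavailable."""
    try:
        return int(metadata.version(package).split(".")[0])
    except (metadata.PackageNotFoundError, ValueError):
        return None


def warn_if_pandas_newer_than(max_major, get_major=lambda: installed_major("pandas")):
    """Hypothetical decorator: warn when pandas is newer than what was tested.

    `get_major` is injectable so the behaviour can be checked without
    depending on whichever pandas version happens to be installed.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            major = get_major()
            if major is not None and major > max_major:
                warnings.warn(
                    f"{func.__name__} was only tested with pandas<={max_major}.x; "
                    f"found pandas {major}.x, so it may fail or misbehave.",
                    UserWarning,
                    stacklevel=2,
                )
            # The wrapped function still runs; the warning is advisory only.
            return func(*args, **kwargs)
        return wrapper
    return decorator


@warn_if_pandas_newer_than(max_major=1)
def featurize_example(values):
    # Stand-in for a real featurizer body (assumption for illustration).
    return [v * 2 for v in values]
```

The check happens at call time rather than import time, so importing the package stays quiet and only users of an affected featurizer see the warning.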
HTH
If it's not that urgent then can we please just reintroduce the pandas pin so that this package actually works for the majority of its userbase? i.e. people who don't mind having a separate virtualenv for matminer to do its featurising. I can't see what MP are using it for in any of the open repos but I'm sure it would be less overhead for everyone if you could just vendor the bit you need (or contribute back pandas v2 support yourselves, as it seems like no one needs it enough to implement it).
We'll have exactly the same problem with pandas 3 soon too.
As an aside, it's likely that it's actually numpy that's causing most of the problems, as a side effect of bumping pandas.
@ml-evs We were able to remove the matminer dependency from the MP stack entirely (other than our builders that depend on robocrys). Feel free to manage matminer's pandas dependency as you, @ardunn and @computron think is best for your user community.
I think I've resolved the remaining issues with numpy 1.24+ support in #925. After merging, I plan to do a v0.9.1 release unless there are any objections. Hopefully it will then be easier to upgrade pandas etc. in the future.
Release is made -- if people run into issues then we can reopen this and consider steps to support pandas v2 (it may not be as difficult as described above, as numpy compatibility has now been fixed [at least as far as it is tested]).
Right, #929 seemed pretty painless so I've just done an immediate follow-up release that enables pandas v2 for those that want it. Hopefully this ends the saga (until the next one ;)).
@ml-evs awesome work. Thank you!