causalml
causalml pickle object failing to load: ValueError: numpy.ndarray size changed
Hi,
We have just recently seen our code failing due to the error message noted below:
build 22-Feb-2021 10:15:19 .tox/py37/lib/python3.7/site-packages/causalml/inference/tree/init.py:3: in
We are using Poetry for our Python package management and we haven't changed any of our library versions. We also haven't changed the pickled model that we are trying to load.
We are using the following library versions which should be compatible with causalml:
numpy: 1.18.5 scipy: 1.4.1 pandas: 1.1.5 scikit-learn: 0.22.2 cython: 0.29.21 tensorflow: 2.4.1
These library versions haven't changed between our last successful run of the code and the failures we are seeing now.
Any advice or help you could give us in root causing this issue would be appreciated.
Vicky
I have seen the same error trying to import CausalMSE / CausalTreeRegressor on CI in python ~~3.7~~ [edit: 3.8.8] (installed through pip)
numpy: 1.18.5 scipy: 1.4.1 pandas: 1.0.5 scikit-learn: 0.23.2 cython: 0.29.22 tensorflow: 2.3.2
edit: I have also reproduced this on a clean 3.8.5 environment with the above library versions
edit: I played around with new conda environments against python 3.7.4, 3.7.9, 3.8.5, and 3.8.8 and various combinations of dependencies but I got mixed results and I'm no longer confident where the issue is.
Ok thanks for looking into this.
I don't know if it helps but we were using python 3.7.3 and the dependencies were being installed by pip.
We were hitting the issue when trying to run our unit tests via Tox which should have been creating a clean 3.7.3 environment with all the correct dependencies. However, we were able to get around the issue by using poetry to create the environment instead and then just calling pytest on the tests.
I still don't understand the root cause of the issue but we think it was related to how Tox was setting up the python 3.7.3 environment with the dependencies.
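One way to compare the tox-built and Poetry-built environments described above is to print the versions actually installed in each one and diff the output. This is only a minimal sketch: it assumes Python 3.8+ for `importlib.metadata` (on 3.7 the `importlib_metadata` backport would have to be installed), and the helper name `installed_versions` is made up for illustration.

```python
import sys

try:
    from importlib.metadata import PackageNotFoundError, version
except ImportError:  # Python 3.7: needs the importlib_metadata backport
    from importlib_metadata import PackageNotFoundError, version

# Packages mentioned in this thread, plus causalml itself.
PACKAGES = ["numpy", "scipy", "pandas", "scikit-learn", "cython", "tensorflow", "causalml"]


def installed_versions(packages=PACKAGES):
    """Return {package: version string or None} for the current interpreter."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Package is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    print("python", sys.version.split()[0])
    for name, ver in installed_versions().items():
        print(name, ver or "not installed")
```

Running the script once inside the tox environment and once inside the Poetry environment should show whether the two really resolve to the same numpy and Cython versions.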
I'm suspicious of https://github.com/uber/causalml/blob/master/setup.py#L12, where setup.py fetches numpy if the machine doesn't already have it, but without specifying a version. Cython is also not pinned to a version in setup.py the way it is in requirements.txt. I'm having trouble building from source on my machine, so I can't confirm this yet; I'll keep working on it. If I get it working and find that pinning the numpy and Cython versions in setup.py fixes the issue, I'll submit a PR.
To update this issue: in our environment this was solved by restricting the versions in setup.py to match what is in requirements.txt. After reading about the "size changed" error, it seems like a newer numpy created some object that an older numpy (<1.19) then tried to load. My guess is that the object was created while setup.py was running, which grabbed the latest numpy instead of <1.19, and that we see this error later when the older numpy tries to load that object.
I could be completely wrong, that's just a hypothesis, but at least adding versions in setup.py fixed the issue in our environment.
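For anyone wanting to try the same workaround, the idea of reusing the requirements.txt constraints in setup.py can be sketched roughly as below. This is not the actual causalml setup.py: the helper `pinned_requirements` and the example requirement lines are hypothetical, and the real pins should be read from the project's own requirements.txt.

```python
import re


def pinned_requirements(requirements_text, names):
    """Return the requirement lines for the given package names, in file order."""
    wanted = {n.lower() for n in names}
    pins = []
    for line in requirements_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # The package name is everything before the first version operator,
        # extras bracket, environment marker, or whitespace.
        name = re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0].lower()
        if name in wanted:
            pins.append(line)
    return pins


# Hypothetical requirements.txt content, NOT copied from the causalml repo.
EXAMPLE_REQUIREMENTS = """
numpy>=1.18.5,<1.19   # assumed upper bound, based on the hypothesis above
Cython>=0.29.21
pandas>=1.0
"""

# These lines could then be handed to setuptools, e.g. as setup_requires
# and install_requires, so the build uses the same numpy as the runtime.
BUILD_PINS = pinned_requirements(EXAMPLE_REQUIREMENTS, ["numpy", "cython"])
```

The point of the design is that the build-time numpy and the run-time numpy come from one list of constraints, so they cannot drift apart the way an unversioned fetch in setup.py allows.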