pyod
Different results with different pandas versions
Hello, I built a CBLOF model with default parameters and ran it with pandas 1.3.4 and pandas 2.0.0. I am getting different scores with the two versions. Please advise.
I believe this is not about pandas but about line 15 in https://github.com/yzhao062/pyod/blob/master/pyod/models/cblof.py. The k-means clustering results can change on every run.
Hello, thanks for the reply. Will the scores be the same if the random_state is the same in cblof? I am getting different scores in iforest and ocsvm as well. I am using pyod version 1.0.0. Is there something I am doing wrong? Please advise. Thanks.
That is what I assume. iforest is also subject to randomness. If you fix the random state, it should be fine.
I set the same random_state value in both iforest and cblof, but I am still getting different values. Is there a hyperparameter in ocsvm like random_state to make sure that it gives the same results every time? Sorry for the follow-up questions, and thanks for the response.
Do I understand correctly: while keeping the random state the same for iforest and cblof, you still get different results between pandas 2.0 and pandas 1.3.4?
OCSVM usually should not face randomness.
Can you provide a minimal example, maybe creating some small arbitrary dataset, where your results differ with pandas 2.0 and pandas 1.3.4? Maybe you do some data preprocessing before handing the data to PyOD which changes the result. That would help in order to troubleshoot your problem.
We are actually seeing a failure when using pandas 2.x vs 1.5.3. ECOD builds a model fine, but with pandas 2.0.x, prediction on the same data that was used for training results in
numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
non-precise type array(pyobject, 2d, C)
During: typing of argument at venv39/lib/python3.9/site-packages/pyod/utils/stat_models.py (248)

File "../venv39/lib/python3.9/site-packages/pyod/utils/stat_models.py", line 248:
def ecdf_terminate_equals_inplace(matrix: np.ndarray, probabilities: np.ndarray):
With pandas 1.5.3, prediction works without error.
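The `array(pyobject, 2d, C)` part of the message suggests that an object-dtype array reached the numba-compiled function. Why that would happen only with pandas 2.0.x is not established here. As a hedged sketch of how one might check for this and work around it, the snippet below builds an illustrative DataFrame (invented for this example) whose columns are object-typed, shows the dtype of the array it converts to, and then converts it explicitly to float64.

```python
import numpy as np
import pandas as pd

# Illustrative frame (an assumption): numeric values stored in object-dtype columns.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4, 5, 6]}).astype(object)

# Default conversion keeps the object dtype, which numba's nopython mode cannot type.
raw = df.to_numpy()
print(raw.dtype)

# Explicit conversion to a numeric dtype before handing the data to a detector.
X = df.to_numpy(dtype=np.float64)
print(X.dtype)
```

Checking `df.dtypes` in both environments, and passing an explicitly converted array such as `X` to both the fit and the prediction calls, would show whether the dtype is the cause in this case.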