pyod Different results with different pandas versions

Hello, I built a CBLOF model with default parameters and ran it with pandas 1.3.4 and pandas 2.0.0. I am getting different scores with the two versions. Please advise.

Apr 24 '23 13:04 wizard1359

I believe this is not about pandas but about line 15 in https://github.com/yzhao062/pyod/blob/master/pyod/models/cblof.py: the KMeans clustering results will change on every run.

Apr 24 '23 13:04 yzhao062

Hello, thanks for the reply. Will the scores be the same if random_state is the same in CBLOF? I am getting different scores in IForest and OCSVM as well. I am using pyod version 1.0.0. Is there something I am doing wrong? Please advise. Thanks.

Apr 24 '23 13:04 wizard1359

That is what I assume. IForest is also subject to randomness. If you fix the random state, it will be fine.

Apr 24 '23 13:04 yzhao062

I set the same random_state value in both IForest and CBLOF, but I am still getting different values. Is there a hyperparameter in OCSVM like random_state to make sure it gives the same results every time? Sorry for the follow-up questions, and thanks for the response.

Apr 24 '23 14:04 wizard1359

Do I understand correctly: while keeping the random state the same for IForest and CBLOF, you still get different results between pandas 2.0 and pandas 1.3.4?

OCSVM usually should not face randomness.

Can you provide a minimal example, maybe creating some small arbitrary dataset, where your results differ with pandas 2.0 and pandas 1.3.4? Maybe you do some data preprocessing before handing the data to PyOD which changes the result. That would help in order to troubleshoot your problem.

May 11 '23 15:05 Lucew

We are actually seeing a failure when using pandas 2.x vs 1.5.3. ECOD builds a model fine, but with pandas 2.0.x, prediction on the same data that was used for training results in:

numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
non-precise type array(pyobject, 2d, C)
During: typing of argument at venv39/lib/python3.9/site-packages/pyod/utils/stat_models.py (248)

File "../venv39/lib/python3.9/site-packages/pyod/utils/stat_models.py", line 248:
def ecdf_terminate_equals_inplace(matrix: np.ndarray, probabilities: np.ndarray):
    """
    for cx in range(probabilities.shape[1]):
    ^

With pandas 1.5.3, prediction works without error.
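The "non-precise type array(pyobject, 2d, C)" part of the message indicates that numba received a 2-D array with object dtype. One possible cause, which is an assumption and not confirmed in this thread, is that the DataFrame converts to an object-dtype array in the failing environment. The sketch below shows how to check for that and cast to float64 before calling the model. The helper name to_float_matrix and the sample DataFrame are hypothetical.

```python
import numpy as np
import pandas as pd


def to_float_matrix(df: pd.DataFrame) -> np.ndarray:
    """Hypothetical helper: return a C-contiguous float64 array from a DataFrame.

    Raises if a column cannot be converted to float.
    """
    return np.ascontiguousarray(df.to_numpy(dtype=np.float64))


# Sample frame (an assumption): one column holds floats but is stored as object.
df = pd.DataFrame({
    "a": pd.Series([1.0, 2.0, 3.0], dtype=object),
    "b": [0.5, 1.5, 2.5],
})

raw = df.to_numpy()           # mixed object/float columns give an object array
clean = to_float_matrix(df)   # explicit cast gives a numeric array

print("column dtypes:", dict(df.dtypes))
print("raw dtype:", raw.dtype)
print("clean dtype:", clean.dtype)
```

Printing df.dtypes in both the pandas 1.5.3 and pandas 2.0.x environments would show whether the column types actually differ. If they do, passing the cast array to predict is one thing to try.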

Jul 18 '23 17:07 rdriskill1234
