
COPOD mixes train set and test set

Open mbongaerts opened this issue 2 years ago • 1 comments

Dear contributor(s),

I have a question with respect to the implementation of the COPOD algorithm.

It seems (if I am not mistaken) that this implementation mixes the train set and the test set when decision_function(X) is called. During fitting, the train set is stored: https://github.com/yzhao062/pyod/blob/7aeefcf65ceb0196434b7adb4fd706bfb404e4e2/pyod/models/copod.py#L121

When a test set is used to obtain the outlier scores, X_train gets concatenated with it: https://github.com/yzhao062/pyod/blob/7aeefcf65ceb0196434b7adb4fd706bfb404e4e2/pyod/models/copod.py#L143

In the next steps, it looks like the parameters fitted earlier (when calling fit()) are overwritten by new parameters computed from the concatenated X (train set + test set): https://github.com/yzhao062/pyod/blob/7aeefcf65ceb0196434b7adb4fd706bfb404e4e2/pyod/models/copod.py#L125-L155

This behavior seems wrong, since the test set and the train set are no longer separated, which in general they should be. I would be happy to receive some clarification about this.
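To make the concern concrete, here is a minimal NumPy-only sketch. It is not the pyod implementation: `left_tail_scores` is a hypothetical helper that only sums the negative log of a left-tail empirical CDF per feature, and the data are synthetic. It compares building the ECDF from the train set alone with building it from train set plus test batch.

```python
import numpy as np

def left_tail_scores(reference, query):
    """Sum over features of -log(left-tail ECDF), with the ECDF built from `reference`.

    Simplified illustration only; hypothetical helper, not pyod's code.
    """
    reference = np.asarray(reference, dtype=float)
    query = np.asarray(query, dtype=float)
    n = reference.shape[0]
    scores = np.zeros(query.shape[0])
    for j in range(reference.shape[1]):
        sorted_col = np.sort(reference[:, j])
        # Number of reference values <= each query value
        counts = np.searchsorted(sorted_col, query[:, j], side="right")
        # Clip so a query below every reference value does not give log(0)
        probs = np.clip(counts, 1, n) / n
        scores += -np.log(probs)
    return scores

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 2))
x_new = np.array([[-2.5, -2.5]])  # the same test point in both batches

# Two test batches that differ only in the rows accompanying x_new
batch_a = np.vstack([x_new, rng.normal(size=(50, 2))])
batch_b = np.vstack([x_new, rng.normal(loc=-3.0, size=(50, 2))])

# Separated: ECDF comes from the train set only
sep_a = left_tail_scores(X_train, batch_a)[0]
sep_b = left_tail_scores(X_train, batch_b)[0]

# Mixed: ECDF comes from train set + test batch
mixed_a = left_tail_scores(np.vstack([X_train, batch_a]), batch_a)[0]
mixed_b = left_tail_scores(np.vstack([X_train, batch_b]), batch_b)[0]

print("separated:", sep_a, sep_b)
print("mixed:    ", mixed_a, mixed_b)
```

In the separated variant the score of `x_new` is identical for both batches. In the mixed variant it depends on which other rows happen to be in the test batch, which is the coupling described above.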

Kind regards

mbongaerts, Apr 28 '22 10:04

Similar to my comment in https://github.com/yzhao062/pyod/issues/395, I would suggest changing the docstring: https://github.com/yzhao062/pyod/blob/7aeefcf65ceb0196434b7adb4fd706bfb404e4e2/pyod/models/copod.py#L126

where 'fitted detector' should be removed, since it could mislead the user into thinking that the parameters were previously learned from the train set.

Another question/concern involves the following line: https://github.com/yzhao062/pyod/blob/7aeefcf65ceb0196434b7adb4fd706bfb404e4e2/pyod/models/copod.py#L143

What will be the behavior of this method when you also pass X_train via decision_function()? In that case the train set is concatenated with itself, which results in duplicated rows/samples. I am not sure if this is a type of behavior you want to allow. I am happy to hear from you.
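As a rough check of what duplication does to an empirical CDF, here is a small NumPy-only sketch. It makes no claim about pyod's actual code path: `ecdf_at` is a hypothetical one-dimensional helper, and the data are synthetic. It compares the ECDF built from the train set alone with the ECDF after the full train set, or only a subset of it, is appended again.

```python
import numpy as np

def ecdf_at(reference, query):
    """Fraction of `reference` values <= each `query` value (1-D inputs).

    Hypothetical helper for illustration, not pyod's code.
    """
    sorted_ref = np.sort(np.asarray(reference, dtype=float))
    return np.searchsorted(sorted_ref, query, side="right") / sorted_ref.size

rng = np.random.default_rng(1)
x_train = rng.normal(size=100)

# Baseline: ECDF built from the train set only
train_only = ecdf_at(x_train, x_train)

# Full train set appended again: every sample appears exactly twice
doubled = np.concatenate([x_train, x_train])
full_dup = ecdf_at(doubled, x_train)

# Only part of the train set appended again (here, the 20 smallest values)
subset = np.sort(x_train)[:20]
partial = np.concatenate([x_train, subset])
partial_dup = ecdf_at(partial, x_train)

print("rows after full duplication:", doubled.shape[0])
print("full duplication keeps ECDF values:", np.allclose(train_only, full_dup))
print("partial duplication keeps ECDF values:", np.allclose(train_only, partial_dup))
```

Under this simplified estimator, duplicating every row doubles the sample count but leaves the ECDF values unchanged, because counts and sample size scale together. Duplicating only some rows over-weights them and shifts the ECDF. Whether the remaining steps of the real implementation behave the same way is not something this sketch verifies.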

Kind regards.

mbongaerts, Apr 29 '22 09:04