Different Results
Hello,
I was told to use this PU algorithm for a research project, so I copied the code you wrote into my own Python file and ran it. It has been over a year since you wrote this code and the corresponding article, so some things have changed. Speaking of the article, thank you so much for it. When I ran your code, I got warnings that the label encoder in XGBClassifier and the default evaluation metric in XGBoost had changed. Ignoring those warnings, the results I got didn't match yours.
Classification results: f1: 53.54% roc: 68.28% recall: 36.56% precision: 100.00%
When I rewrote the XGBClassifiers as "xgb.XGBClassifier(eval_metric='error')", my results were a little better.
Classification results: f1: 60.57% roc: 71.72% recall: 43.44% precision: 100.00%
Regardless of what I did with the XGBClassifiers, the baseline results consistently matched yours.
Classification results: f1: 99.57% roc: 99.57% recall: 99.15% precision: 100.00%
Do you know why the final classification results I got are different from yours?
Thank you
I got similar results. Using PU-adjusted ML, the classification is only a little better than the threshold. I spent a lot of time on the PU methodology, which generates all sorts of bugs, and I am not going to spend more time on it for my research.
The bagging approach is often more stable and gets better results. See the bottom of this notebook for a quick example (under the heading "Use bagging and LGBMClassifier"): https://github.com/a-agmon/pu-learn/blob/master/PU_Learning_WBagging.ipynb
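In case it helps, here is a rough sketch of the PU bagging idea (train "known positives vs. a random unlabeled subsample" many times and average out-of-bag scores). It uses scikit-learn's DecisionTreeClassifier in place of LGBMClassifier so it runs without extra dependencies, and the toy dataset, sample sizes, and round count are all made up for illustration — see the notebook for the real version:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Toy PU setup: hide most positive labels so they look "unlabeled".
rng = np.random.RandomState(0)
X, y_true = make_classification(n_samples=600, n_features=10, random_state=0)
y_pu = np.zeros_like(y_true)                      # 0 = unlabeled
known_pos = rng.choice(np.flatnonzero(y_true == 1), size=60, replace=False)
y_pu[known_pos] = 1                               # 1 = known positive

pos_idx = np.flatnonzero(y_pu == 1)
unl_idx = np.flatnonzero(y_pu == 0)

# PU bagging: each round, train on all known positives vs. a bootstrap
# sample of the unlabeled pool, then score the out-of-bag unlabeled points.
n_rounds = 50
scores = np.zeros(len(unl_idx))
counts = np.zeros(len(unl_idx))
for _ in range(n_rounds):
    samp = rng.choice(len(unl_idx), size=len(pos_idx), replace=True)
    X_train = np.vstack([X[pos_idx], X[unl_idx[samp]]])
    y_train = np.concatenate([np.ones(len(pos_idx)), np.zeros(len(samp))])
    clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)
    oob = np.setdiff1d(np.arange(len(unl_idx)), samp)   # out-of-bag unlabeled
    scores[oob] += clf.predict_proba(X[unl_idx[oob]])[:, 1]
    counts[oob] += 1

# Averaged positive-class score for each unlabeled point; higher means
# "more likely a hidden positive".
avg = scores / np.maximum(counts, 1)
```

Averaging out-of-bag scores over many subsamples is what makes this more stable than a single PU-adjusted classifier: no individual model ever sees the whole (noisy) unlabeled pool as its negative class.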
@mhsultan-1998 @kk-learn
