Bias in UncertaintyForest performance compared to paper
My issue is that the UncertaintyForest benchmarks notebook shows ProgLearn's UncertaintyForest class underperforming IRF at d=20, which we did not see in the original paper.
I checked that samples are now taken without replacement in both the deprecated uncertainty-forest repo and in ProgLearn: bootstrap = False in the Figure 2 tutorial in the uncertainty-forest repo, and replace = False in progressive-learner.py in ProgLearn. I also believe that the n_estimators (300), tree_construction_proportion (0.4), and kappa (3) values are the same.
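
For reference, here is a minimal sketch (not code from either repo; sample counts and variable names are made up for illustration) of what those settings mean: each tree's construction set is drawn without replacement, using the hyperparameter values quoted above.

```python
import numpy as np

n_estimators = 300                  # number of trees (per the values above)
tree_construction_proportion = 0.4  # fraction of data used to build each tree's structure
kappa = 3                           # finite-sample correction used when estimating posteriors

rng = np.random.default_rng(0)
n_samples = 1000                    # arbitrary example size
indices = np.arange(n_samples)

for _ in range(n_estimators):
    # bootstrap = False / replace = False: subsample WITHOUT replacement,
    # so no sample appears twice in a single tree's construction set.
    construction_idx = rng.choice(
        indices,
        size=int(tree_construction_proportion * n_samples),
        replace=False,
    )
    # the remaining samples would then be used for leaf-posterior (voting) estimation
    voting_idx = np.setdiff1d(indices, construction_idx)
```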
Snapshot of documentation error:
From the paper (original Figure 2):

From benchmarks in EYezerets/ProgLearn on the fig2benchmark branch:

Additional context
Sorry, for some reason I'm not able to assign Richard to this issue. Could someone please help me include him in this conversation?
@EYezerets as Richard is not a contributor, he has to comment on this issue before he can be assigned.
@rguo123 I can't assign you, so I Slacked you ;)
@EYezerets Has there been enough progress on this to close this issue?
@levinwil Sorry Will, we haven't really had any new ideas on this recently. Is it impeding progress on the repo?
@EYezerets One difference I found between the original notebook and the new benchmark functions is that in the estimate_ce function, the original UF never held out the 0.3 test split for evaluation. So the original UF might perform better because it used all of the training data, which would make the original figures more "biased"?
To put it more specifically, the original UF uses a |DP| : |DV| : |DE| ratio of 0.4 : 0.3 : 0.3, while the new benchmark functions use 0.4*0.7 : 0.6*0.7 : 0.3 = 0.28 : 0.42 : 0.3. Also, the 0.3 evaluation dataset is the same for all trees, unlike in the original implementation.
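
For concreteness, a small arithmetic check of the two splitting schemes described above (hypothetical illustration, not code from either repo):

```python
# Original UF notebook: each tree splits all data into structure (DP),
# voting (DV), and evaluation (DE) sets directly.
original = {"DP": 0.4, "DV": 0.3, "DE": 0.3}

# New benchmark functions: 30% is first held out as a shared evaluation set
# (the same DE for every tree), and the remaining 70% is split 0.4 / 0.6
# into structure and voting sets.
holdout = 0.3
new = {
    "DP": round(0.4 * (1 - holdout), 2),
    "DV": round(0.6 * (1 - holdout), 2),
    "DE": holdout,
}

print(original)  # {'DP': 0.4, 'DV': 0.3, 'DE': 0.3}
print(new)       # {'DP': 0.28, 'DV': 0.42, 'DE': 0.3}
```

So the new benchmark gives each tree a smaller structure set (0.28 vs. 0.4 of the data) and a shared, fixed evaluation set, either of which could account for the performance gap relative to the original figures.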