Adding active selection support using epistemic uncertainty for reward ensemble (#462)

Open taufeeque9 opened this issue 3 years ago • 1 comments

This pull request adds the ActiveSelectionFragmenter class for active learning, with three supported uncertainty variants: logit, probability, and label. It also refactors CrossEntropyRewardLoss by introducing the PreferencePredictor class, which wraps a RewardNet to produce a model that predicts the preference probability for a given fragment pair.

taufeeque9 • Jul 20 '22 23:07

Codecov Report

Merging #482 (9cd4559) into master (45232b7) will increase coverage by 0.06%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #482      +/-   ##
==========================================
+ Coverage   96.88%   96.94%   +0.06%     
==========================================
  Files          84       84              
  Lines        7278     7421     +143     
==========================================
+ Hits         7051     7194     +143     
  Misses        227      227              
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/imitation/scripts/common/reward.py | 98.64% <ø> (ø) | |
| src/imitation/algorithms/preference_comparisons.py | 99.17% <100.00%> (+0.19%) | :arrow_up: |
| ...ion/scripts/config/train_preference_comparisons.py | 85.33% <100.00%> (+0.61%) | :arrow_up: |
| .../imitation/scripts/train_preference_comparisons.py | 98.36% <100.00%> (+0.08%) | :arrow_up: |
| tests/algorithms/test_preference_comparisons.py | 100.00% <100.00%> (ø) | |
| tests/scripts/test_scripts.py | 100.00% <100.00%> (ø) | |

codecov[bot] • Jul 21 '22 00:07

Codecov is passing now after the merge (which I screwed up but fixed). I think Codecov sometimes gets confused when the base isn't the most recent version of master.

AdamGleave • Aug 23 '22 04:08