
Exp4.P cannot handle delayed rewards

Open ianlini opened this issue 8 years ago • 4 comments

Maybe we should read the score from the history instead of storing it in the model.
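A minimal sketch of that idea, assuming a simplified Exp4-style update rather than the full Exp4.P rule. It does not use striatum's actual classes: `DelayedExp4Sketch`, `get_action`, `reward`, and the `history` dict are all hypothetical names. Each decision stores its own expert advice and action probabilities under a history id, so a reward that arrives later is applied with the values from that round, not with whatever the model holds at that moment.

```python
import math
import random


class DelayedExp4Sketch:
    """Simplified Exp4-style learner (hypothetical, not striatum's API)."""

    def __init__(self, n_experts, n_actions, gamma=0.1, seed=0):
        self.n_actions = n_actions
        self.gamma = gamma
        self.weights = [1.0] * n_experts
        # history_id -> (advice, probs, action) as they were at decision time
        self.history = {}
        self._next_id = 0
        self._rng = random.Random(seed)

    def get_action(self, advice):
        """advice[i][a] is expert i's probability for action a."""
        total = sum(self.weights)
        probs = []
        for a in range(self.n_actions):
            mix = sum(w * adv[a] for w, adv in zip(self.weights, advice)) / total
            probs.append((1 - self.gamma) * mix + self.gamma / self.n_actions)
        action = self._rng.choices(range(self.n_actions), weights=probs)[0]
        history_id = self._next_id
        self._next_id += 1
        # Keep the scores with the decision, not as "latest" state in the model.
        self.history[history_id] = (advice, probs, action)
        return history_id, action

    def reward(self, history_id, reward):
        """Apply a (possibly late) reward using the stored round data."""
        advice, probs, action = self.history.pop(history_id)
        estimate = reward / probs[action]  # importance-weighted reward
        for i, adv in enumerate(advice):
            self.weights[i] *= math.exp(
                self.gamma * adv[action] * estimate / self.n_actions
            )
```

With this layout, several calls to `get_action` can happen before any call to `reward`, and rewards can come back in any order, because each update only depends on the entry stored under its own id.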

ianlini avatar Sep 12 '16 07:09 ianlini

This algorithm was not originally designed for delayed rewards.

yangarbiter avatar Sep 12 '16 08:09 yangarbiter

LinUCB wasn't either, but we modified it to support them...

ianlini avatar Sep 12 '16 09:09 ianlini

The current implementation of Exp3 also doesn't support delayed rewards.

ianlini avatar Oct 17 '16 07:10 ianlini

#108 is the solution for exp3

ianlini avatar Oct 28 '16 10:10 ianlini