Ian Lin
LinUCB doesn't either, but we modified it to make it work...
The current implementation of Exp3 also doesn't support delayed rewards.
#108 is the solution for exp3
How do we define T?
I think we should fix N. What happens if we have more than T rounds?
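For context, here is a minimal sketch of where T and N enter the standard Exp4.P formulation: the minimum exploration probability is `p_min = sqrt(ln N / (K T))`, so both have to be known up front. The function name and the example values are hypothetical and not taken from this codebase.

```python
import math


def exp4p_p_min(n_experts, n_actions, horizon):
    """Minimum exploration probability p_min = sqrt(ln N / (K T)) from the Exp4.P paper.

    n_experts is N, n_actions is K, horizon is T (hypothetical parameter names).
    """
    return math.sqrt(math.log(n_experts) / (n_actions * horizon))


# Illustrative values only: N = 10 experts, K = 5 actions.
p_short = exp4p_p_min(10, 5, horizon=1000)
p_long = exp4p_p_min(10, 5, horizon=10000)
```

Since `p_min` is tuned for a specific T, running past T rounds means the exploration rate no longer matches the horizon it was derived for.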
I think the actions and experts should both be fixed... I don't think Exp4.P can reasonably handle changes to the actions and experts... This is a big change, any ideas? @yangarbiter...
After retraining the experts, I don't think the existing weights still work, and choosing the weight for a new action is also a problem.
@taweihuang
I think this only transforms `query_vector` into an `np.ndarray`?
It transforms `query_vector` into an `ndarray` because every value in `query_vector` is an `np.float64`, so the division turns `query_vector` into an `ndarray`. It's quite unexpected.
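A minimal sketch of that behavior, assuming `query_vector` is a plain list of `np.float64` values and is normalized by its sum (the values and the normalization step are hypothetical, not copied from the code under review):

```python
import numpy as np

# Hypothetical stand-in for query_vector: a plain list of np.float64 scores.
query_vector = [np.float64(1.0), np.float64(3.0)]

# The built-in sum of np.float64 values is itself an np.float64.
total = sum(query_vector)

# list has no __truediv__, so Python falls back to np.float64.__rtruediv__,
# which converts the list to an array and divides elementwise.
normalized = query_vector / total

# With plain Python floats the same expression raises a TypeError instead.
try:
    [1.0, 3.0] / 4.0
    plain_float_failed = False
except TypeError:
    plain_float_failed = True
```

So the conversion comes from the divisor being a NumPy scalar, not from any explicit `np.array(...)` call.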