bandit_simulations

The lower I set alpha, the higher the click rate it returns

Open hcygeorge opened this issue 3 years ago • 2 comments

Hi, I have tried your disjoint LinUCB implementation, and I found that the lower I set alpha, the higher the CTR it returns.

When alpha = 0.01, the cumulative click rate almost converges to 0.9. I guess something is wrong with the dataset, since a lower alpha means it nearly gives up exploration.

Any idea of how this happened and how to fix it?

hcygeorge, Apr 15 '21 08:04

Hi @hcygeorge, sorry for the late reply; it's been a long time since I revisited this repo.

My hypothesis is that the dataset was generated by a simulation with 10 different contextual arms and very little "noise". I also suspect that, with the ridge regression methodology, the model was able to find the best arm for each context easily. Having a lower alpha therefore means less exploration of other arms and more exploitation of arms that have already been doing well.
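For reference, this is how alpha enters the arm score in the standard disjoint LinUCB formulation (assumed here to match the implementation in this repo; the symbols below are the usual textbook ones, not names taken from the code):

$$
p_{t,a} = \hat{\theta}_a^{\top} x_{t,a} + \alpha \sqrt{x_{t,a}^{\top} A_a^{-1} x_{t,a}}, \qquad \hat{\theta}_a = A_a^{-1} b_a
$$

The first term is the ridge regression estimate of the reward and the second is the exploration bonus. As alpha approaches 0, the chosen arm is simply the one with the highest estimated reward, so if those estimates become accurate quickly, a small alpha loses very little.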

That would allow the system to reach a CTR of 0.90 over a similar range of timesteps (I am presuming that your run was about 800 to 1100 steps).
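A rough way to sanity-check this hypothesis is a small synthetic experiment. The sketch below is not the repo's code and does not use the homework dataset or its evaluation procedure: it assumes 10 arms with linear, low-noise rewards, runs a textbook disjoint LinUCB online, and reports how often the chosen arm was the truly best one (a stand-in for CTR). The function name `hit_rate` and all parameter values are made up for illustration.

```python
import numpy as np


def hit_rate(alpha, n_arms=10, n_features=5, n_steps=2000, noise=0.01, seed=0):
    """Fraction of steps on which disjoint LinUCB picks the truly best arm.

    Assumption: rewards are linear in the context with small Gaussian noise.
    This is a synthetic stand-in, not the dataset discussed in this issue.
    """
    rng = np.random.default_rng(seed)
    true_theta = rng.normal(size=(n_arms, n_features))  # hidden per-arm weights

    # Per-arm ridge regression state: A_a = I + sum(x x^T), b_a = sum(r x)
    A = np.tile(np.eye(n_features), (n_arms, 1, 1))
    b = np.zeros((n_arms, n_features))

    hits = 0
    for _ in range(n_steps):
        x = rng.normal(size=n_features)  # context shared by all arms

        A_inv = np.linalg.inv(A)  # batched inverse, shape (arms, d, d)
        theta_hat = np.einsum("aij,aj->ai", A_inv, b)  # ridge estimates
        mean = theta_hat @ x  # estimated reward per arm
        width = np.sqrt(np.einsum("i,aij,j->a", x, A_inv, x))  # uncertainty

        arm = int(np.argmax(mean + alpha * width))  # UCB arm choice

        expected = true_theta @ x  # true expected reward per arm
        reward = expected[arm] + noise * rng.normal()

        # Update only the arm that was played
        A[arm] += np.outer(x, x)
        b[arm] += reward * x

        hits += int(arm == int(np.argmax(expected)))

    return hits / n_steps


if __name__ == "__main__":
    for alpha in (0.01, 0.5, 2.0, 1000.0):
        print(f"alpha={alpha}: hit rate={hit_rate(alpha):.3f}")
```

If the hypothesis holds, the smaller alpha values should score higher in a setup like this, because the varied contexts already provide enough incidental exploration for the ridge estimates to become accurate. The exact numbers depend on the seed, the noise level, and the number of steps, so treat it as a qualitative check only.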

Let me know if that clarifies!

kfoofw, Jul 22 '21 06:07

I actually found that the dataset was part of a homework assignment for an ML class at Columbia University. For your reference, here's the link to the homework assignment.

http://www.cs.columbia.edu/~jebara/6998/hw2.pdf

kfoofw, Jul 22 '21 06:07