Divide by zero

Open pedronahum opened this issue 6 years ago • 4 comments

Hi, First and foremost, thanks for sharing the code. This is greatly appreciated.

I am currently testing ARS in other learning environments and found that, in very difficult environments, users of the code may hit a divide-by-zero error, particularly in the early stages of learning (i.e., when every initial rollout returns zero reward, so the standard deviation of the rewards is zero).

# normalize rewards by their standard deviation
rollout_rewards /= np.std(rollout_rewards)

Thanks,

pedronahum avatar Mar 28 '18 19:03 pedronahum
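
The failure described above can be reproduced in isolation. This is a minimal sketch, assuming a float array of rollout rewards that are all zero; the array shape is hypothetical and not taken from the ARS code. With NumPy float arrays the division typically does not raise an exception but yields NaN values along with a RuntimeWarning, which then propagate into the update step.

```python
import numpy as np

# Hypothetical batch of rollout rewards: all zero, as can happen in a
# sparse-reward task early in training. The shape is illustrative only.
rollout_rewards = np.zeros((4, 2), dtype=np.float64)

std = np.std(rollout_rewards)  # 0.0, because every value is identical

# Suppress the RuntimeWarning so the result is easy to inspect.
with np.errstate(divide="ignore", invalid="ignore"):
    normalized = rollout_rewards / std  # 0.0 / 0.0 evaluates to NaN

all_nan = bool(np.isnan(normalized).all())
```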

I have run into this kind of difficulty in every sparse-reward setting. Is ARS a good choice for these optimization landscapes?

hari-sikchi avatar Feb 03 '19 11:02 hari-sikchi

Can we use .clip(min=1e-2) to avoid that?

ashutoshtiwari13 avatar Feb 18 '19 18:02 ashutoshtiwari13

In my case, adding 1e-8 to the divisor did the trick...

pedronahum avatar Feb 18 '19 18:02 pedronahum

Yeah @pedronahum, that would do it too!

ashutoshtiwari13 avatar Feb 18 '19 18:02 ashutoshtiwari13
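
The two workarounds suggested in this thread can be sketched side by side. This is an illustration under stated assumptions, not the patch applied to the ARS repository: the function names `normalize_eps` and `normalize_floor` are hypothetical, and the floor variant uses `np.maximum` as a stand-in for the suggested `.clip(min=1e-2)`.

```python
import numpy as np

def normalize_eps(rewards, eps=1e-8):
    """Divide by the standard deviation plus a small epsilon (1e-8, as in the thread)."""
    rewards = np.asarray(rewards, dtype=np.float64)
    return rewards / (np.std(rewards) + eps)

def normalize_floor(rewards, min_std=1e-2):
    """Divide by the standard deviation, floored at min_std (1e-2, as in the thread)."""
    rewards = np.asarray(rewards, dtype=np.float64)
    return rewards / np.maximum(np.std(rewards), min_std)

# All-zero rewards no longer produce NaN with either variant.
zeros = np.zeros(8)
safe_eps = normalize_eps(zeros)
safe_floor = normalize_floor(zeros)

# Rewards with real spread are still scaled to (approximately) unit standard deviation.
spread = normalize_eps([1.0, 2.0, 3.0])
```

The two variants differ when the standard deviation is small but nonzero: the 1e-8 epsilon leaves the scaling essentially unchanged, while a 1e-2 floor stops rescaling once the spread drops below that threshold. Which behavior is preferable likely depends on the reward scale of the environment.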