ARS Divide by zero

Open pedronahum opened this issue 6 years ago • 4 comments

Hi, first and foremost, thanks for sharing the code. This is greatly appreciated.

I am currently testing ARS in other learning environments and found that in very difficult environments users of the code may hit a divide-by-zero error, particularly early in the learning process (i.e., when every initial rollout returns zero reward, so the standard deviation of the rewards is zero). The line in question:

# normalize rewards by their standard deviation
rollout_rewards /= np.std(rollout_rewards)

Thanks,

Mar 28 '18 19:03 pedronahum

I have run into this kind of difficulty in every sparse-reward setting. Is ARS a good way to go for these optimization landscapes?

Feb 03 '19 11:02 hari-sikchi

Can we use .clip(min=1e-2) to avoid that?

Feb 18 '19 18:02 ashutoshtiwari13

In my case, adding 1e-8 to the divisor did the trick...

Feb 18 '19 18:02 pedronahum

Yeah @pedronahum, that would do it too!

Feb 18 '19 18:02 ashutoshtiwari13
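
For reference, here is a minimal sketch of the two workarounds suggested in this thread. It is not the repository's actual fix. The function names `normalize_rewards_eps` and `normalize_rewards_floor`, and the assumption that `rollout_rewards` is a plain NumPy array of per-rollout rewards, are hypothetical. Only the constants 1e-8 and 1e-2 come from the comments above.

```python
import numpy as np


def normalize_rewards_eps(rollout_rewards, eps=1e-8):
    """Divide rewards by (std + eps) so a zero std cannot produce NaN/inf.

    eps=1e-8 is the value mentioned in the thread; the function name is
    hypothetical and not part of the ARS code base.
    """
    rollout_rewards = np.asarray(rollout_rewards, dtype=np.float64)
    return rollout_rewards / (np.std(rollout_rewards) + eps)


def normalize_rewards_floor(rollout_rewards, min_std=1e-2):
    """Divide rewards by max(std, min_std), i.e. clip the divisor from below.

    min_std=1e-2 is the value suggested in the thread; the function name is
    hypothetical and not part of the ARS code base.
    """
    rollout_rewards = np.asarray(rollout_rewards, dtype=np.float64)
    std = max(float(np.std(rollout_rewards)), min_std)
    return rollout_rewards / std


# All-zero rewards, as in the early rollouts of a sparse-reward environment.
zero_rewards = np.zeros((4, 2))
print(normalize_rewards_eps(zero_rewards))
print(normalize_rewards_floor(zero_rewards))
```

Both variants leave all-zero rewards at zero instead of producing NaN. They differ when the standard deviation is small but nonzero. The epsilon version barely changes the divisor, so tiny reward differences are scaled up to roughly unit standard deviation. The floor version caps that amplification at 1 / min_std.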
