Improve NAF calculation and fix info processing

Open: evhub opened this issue 6 years ago • 1 comment

Two changes with this PR:

  1. Fixes an error in Agent.fit that prevented the use of NumPy arrays as values in the info dictionary. Specifically, np.isreal returns an element-wise boolean array when called on an array, and that array cannot be checked for truthiness unless np.all is called on it first.

  2. Improves the NAF calculation to reduce the amount of exponentiation necessary by multiplying by a diagonal mask both before and after exponentiation.
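Both changes can be illustrated with a small NumPy sketch. This is not the code from the PR: the helper names `all_real` and `exp_diagonal` are hypothetical, and the second function is only one plausible reading of "multiplying by a diagonal mask both before and after exponentiation", written in plain NumPy rather than with backend tensors.

```python
import numpy as np


def all_real(value):
    """Return True if every element of `value` is real (hypothetical helper).

    np.isreal is element-wise, so for array input it returns a boolean array.
    Using that array directly in an `if` raises ValueError when it has more
    than one element; np.all reduces it to a single truth value first.
    """
    return bool(np.all(np.isreal(value)))


def exp_diagonal(L):
    """Exponentiate only the diagonal of a square matrix (illustrative sketch).

    The mask is applied before np.exp so off-diagonal entries are zeroed and
    cannot overflow, and again afterwards to drop the exp(0) = 1 values that
    the first masking leaves off the diagonal.
    """
    L = np.asarray(L, dtype=float)
    mask = np.eye(L.shape[-1])
    diag = np.exp(L * mask) * mask      # mask before and after exponentiation
    return L * (1.0 - mask) + diag      # keep off-diagonal entries unchanged


# Assumed example values, chosen only for illustration.
info_value = np.array([0.5, 1.5, 2.5])
matrix = np.array([[0.0, 0.0],
                   [1000.0, 0.0]])
result = exp_diagonal(matrix)
```

In this sketch, masking before the exponential means the large off-diagonal entry never reaches np.exp, so the result stays finite, whereas exponentiating the whole matrix and masking afterwards would produce an overflow there.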

evhub, Dec 13 '18 22:12

Now also fixes an error processing strange info dictionaries from gym environments.

evhub, Dec 14 '18 23:12