Somshubra Majumdar
SciPy's odeint uses solvers other than dopri5 under the hood; they are written in Fortran and built to switch between stiff and non-stiff methods. So its general performance is excellent. You...
With k2 = 2000., you are presenting a stiff ODE. Solvers like VODE and CVODE (inside scipy) can handle such stiff equations by performing a backward solve through the ODE rather...
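To make the stiffness point concrete, here is a minimal sketch in plain Python. It assumes the linear test equation y' = -k2 * y as a stand-in (the original system is not shown here), and the helper names `explicit_euler` and `implicit_euler` are hypothetical. It is not how VODE or odeint work internally; it only shows why a backward (implicit) step stays stable at a step size where a forward (explicit) step blows up.

```python
# Illustrative only: y' = -k2 * y stands in for the original (unseen) system.
# k2 = 2000. is taken from the comment above; everything else is assumed.

def explicit_euler(k, y0, h, steps):
    """Forward Euler: y_{n+1} = y_n + h * (-k * y_n)."""
    y = y0
    for _ in range(steps):
        y = y + h * (-k * y)
    return y


def implicit_euler(k, y0, h, steps):
    """Backward Euler: solve y_{n+1} = y_n + h * (-k * y_{n+1}) for y_{n+1}."""
    y = y0
    for _ in range(steps):
        y = y / (1.0 + h * k)
    return y


k2 = 2000.0
h = 0.01  # step size much larger than the 1 / k2 time scale of the problem

y_explicit = explicit_euler(k2, 1.0, h, 10)  # magnitude grows every step
y_implicit = implicit_euler(k2, 1.0, h, 10)  # decays toward zero, like the true solution
```

The true solution decays to zero, so only the implicit result is qualitatively right at this step size. An explicit method would need h below roughly 2 / k2 to stay stable, which is why non-stiff solvers become slow on problems like this.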
So I'll preface this with two things: one, that this would be my first foray into RL, and two, that the paper was woefully inadequate in implementation details. 1) My understanding...
Hmm, the comments on (3) and (4) are quite interesting. I think in this code, if each architecture starts with a zero state in the beginning, and I don't sample...
```
for t in reversed(range(0, rewards.size)):
    if rewards[t] != 0:
        running_add = 0
    running_add = running_add * self.discount_factor + rewards[t]
    discounted_rewards[t] = running_add
return discounted_rewards[-1]
```
This is the discounted...
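For anyone who wants to step through that loop outside the class, here is a standalone sketch in plain Python. The function name, the list-based signature, and the initial `running_add = 0.0` are assumptions for illustration (the snippet above does not show how the accumulator is initialised), and it returns the whole list rather than only the last entry.

```python
def discount_rewards(rewards, discount_factor):
    """Return discounted returns, restarting the sum at each non-zero reward."""
    discounted = [0.0] * len(rewards)
    running_add = 0.0  # assumed initial value; not shown in the snippet above
    for t in reversed(range(len(rewards))):
        if rewards[t] != 0:
            # A non-zero reward resets the accumulator before it is added back in.
            running_add = 0.0
        running_add = running_add * discount_factor + rewards[t]
        discounted[t] = running_add
    return discounted


example = discount_rewards([0.0, 0.0, 1.0], 0.5)
```

With these inputs the reward of 1.0 at the last step is propagated backwards and halved at each earlier step. Under the zero-initialisation assumption, the last entry of the result always equals the last raw reward, which is worth keeping in mind when only `discounted_rewards[-1]` is returned.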
Hmm. When I was implementing this, many of the details were not available in the paper, so I had to come up with reasonable defaults. Ofc, those may have...
I don't really understand what you mean by the "whole" state. Keras, and an RNN in general, will only accept a state of shape (batch size, state size)...
Yes, that's a way of priming the RNN states for prediction, and it can be done. I haven't implemented that, but if you so wish, you can send a PR and...
The findings match the paper. The performance of batch renorm is just slightly higher than that of batchnorm. Renorm simply reaches that high score faster than batchnorm does.
Hmm, I did try with batch sizes of 4 and 8, and the results were similar to those with larger batch sizes. Batch renorm quickly hits a high score whereas...