terryfrankcombe
My progress is slow. Inside my torchfort_m.F90 I have extended the `torchfort_rl_off_policy_predict` interface with a version `torchfort_rl_off_policy_predict_double_2d_1d` that takes a (real64) 2D environment and seeks a (real64) 1D action. `torchfort_rl_off_policy_predict_c`...
I'm a numpty: I changed bits of the code and forgot other bits. :-/ So I've simplified again. My state contains 32768 elements, and my action contains 6. Now I have...
Thanks @romerojosh for your excellent reply!
I'm using SAC because I have a one-to-many problem and I want the model to return diverse actions, so the stochastic nature of SAC is attractive. I can quickly evaluate...
@romerojosh, can I return to this for a moment:

> Looking at the error in your most recent comment, the issue is that the input state vectors in our default...