astooke
Yes, sorry, this is an uninformative error I sometimes run into. It usually means something is wrong inside the subprocess that generates the example action, observation, etc. One...
Hi, and thanks! :) It should all be in `params.pkl`, which will have the agent state dict, inside of which is the model state dict. (To save that, set the...
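To find the model weights inside the snapshot, it can help to print the nested key layout first. This is a minimal sketch, assuming the file has already been loaded into a plain dict (for example with `torch.load`, not shown here). The key names in the example dict are placeholders, not necessarily what rlpyt writes.

```python
def key_paths(obj, prefix=""):
    """Return every nested dict key as a slash-separated path."""
    paths = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}/{key}" if prefix else str(key)
            paths.append(path)
            paths.extend(key_paths(value, path))
    return paths


# Placeholder snapshot standing in for the loaded params.pkl contents.
# These key names are assumptions made for illustration.
snapshot = {
    "itr": 100,
    "agent_state_dict": {"model": {"fc.weight": [0.1], "fc.bias": [0.0]}},
}

for path in key_paths(snapshot):
    print(path)
```

Running the same helper on the real loaded dict shows where the model state dict sits, so it can be passed to the model's `load_state_dict()`.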
Ohh, good point! The epsilon might not be loaded...an oversight on my part. You could save epsilon to the agent's state dict and load that back up. Or just when...
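The first suggestion (saving epsilon in the agent's state dict) could look roughly like this. It is a sketch only: `EpsilonAgentSketch` is a hypothetical stand-in written for illustration, not rlpyt's agent class, and the `"epsilon"` key is an assumed name.

```python
class EpsilonAgentSketch:
    """Hypothetical stand-in for an epsilon-greedy agent; not rlpyt's class."""

    def __init__(self, epsilon=1.0):
        self.epsilon = epsilon
        self.model_params = {"w": [0.0]}  # placeholder for model weights

    def state_dict(self):
        # Store epsilon next to the model so a snapshot restores both.
        return {"model": dict(self.model_params), "epsilon": self.epsilon}

    def load_state_dict(self, state):
        self.model_params = dict(state["model"])
        # Older snapshots without the key keep the current epsilon.
        self.epsilon = state.get("epsilon", self.epsilon)


trained = EpsilonAgentSketch(epsilon=1.0)
trained.epsilon = 0.05  # value reached after annealing
saved = trained.state_dict()

resumed = EpsilonAgentSketch()  # starts at the default 1.0
resumed.load_state_dict(saved)
print(resumed.epsilon)
```

Falling back to the current value when the key is missing keeps snapshots saved before the change loadable.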
OK, good to note! I'll try running everything again with the latest PyTorch and then update the conda env yaml if everything runs. I'm only on Ubuntu tho.
Nice note, thanks for that! On a related note, we've recently pushed an update which includes observation normalization in the policy gradient algorithms: Commit 98fefa2d8550bddbdff8f44004062dc5e72bf56b So a question for reward...
@vzhuang Curious if you ended up getting this running? Definitely a useful piece to add. :)
Hi, interesting question! One challenge to this is that memory is pre-allocated for the observations and actions, according to the sampler batch size. So it can't have variable-sized observations or...
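To illustrate why pre-allocation fixes the observation size, here is a toy sketch. `FixedObsBuffer` is a hypothetical, stdlib-only stand-in for the array buffers a real sampler would use; it is not rlpyt code.

```python
from array import array


class FixedObsBuffer:
    """Toy pre-allocated buffer: n_steps slots of obs_size floats each."""

    def __init__(self, n_steps, obs_size):
        self.n_steps = n_steps
        self.obs_size = obs_size
        # All memory is allocated once, before any sampling happens.
        self.data = array("f", [0.0] * (n_steps * obs_size))

    def write(self, step, obs):
        if len(obs) != self.obs_size:
            raise ValueError(
                f"expected {self.obs_size} values, got {len(obs)}"
            )
        start = step * self.obs_size
        self.data[start:start + self.obs_size] = array("f", obs)


buf = FixedObsBuffer(n_steps=4, obs_size=3)
buf.write(0, [1.0, 2.0, 3.0])  # fits the pre-allocated slot

try:
    buf.write(1, [1.0, 2.0])  # different size: rejected
except ValueError as err:
    print(err)
```

Because every slot has the same size, an observation of a different length has nowhere to go without changing how the buffer is built.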
> where is this memory pre-allocated and where could I modify it? Sure! Here it is in the serial sampler: https://github.com/astooke/rlpyt/blob/75e96cda433626868fd2a30058be67b99bbad810/rlpyt/samplers/serial/sampler.py#L36 Otherwise look for `build_samples_buffer()` and see inside that. >...
@tarungog Hi! Curious if you pursued anything for variable-sized observations and actions?
Hmm, I'm not familiar with that error message. Are the worker processes even able to fork?