Rita Laezza
Hey, how is the progress on this feature? I was trying to see if I could implement this around **rlpyt** myself, but it isn't very straightforward. Maybe a good source...
If nobody else has the time, I guess I can give it a try. I have already been using rlpyt for goal-based environments.
What if I want to use the saved model state dictionary (in params.pkl) to sample actions? The goal is to have that in a loop and run `env.step(action)` with the...
@kaixin96 Thank you for the tip. I had already made another workaround... As I had guessed, creating an instance of the torch model used for the agent's policy and then...
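The workaround described above can be sketched as follows. This is a minimal, hypothetical example, not rlpyt's actual classes: `PolicyModel` stands in for whatever torch module the trained agent wraps, and the snapshot key `"agent_state_dict"` is assumed to match how the checkpoint was saved. Here the snapshot is created in-place so the loading pattern is self-contained.

```python
import torch

# Hypothetical stand-in for the torch model behind the agent's policy;
# the real class and layer sizes come from the training configuration.
class PolicyModel(torch.nn.Module):
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(obs_dim, 64),
            torch.nn.Tanh(),
            torch.nn.Linear(64, act_dim),
        )

    def forward(self, obs):
        return self.net(obs)

# Pretend training saved a snapshot like params.pkl with the weights
# stored under an "agent_state_dict" key (an assumption of this sketch).
trained = PolicyModel(obs_dim=4, act_dim=2)
torch.save({"agent_state_dict": trained.state_dict()}, "params.pkl")

# Later: rebuild the same architecture and load the saved weights into it.
model = PolicyModel(obs_dim=4, act_dim=2)
snapshot = torch.load("params.pkl", map_location="cpu")
model.load_state_dict(snapshot["agent_state_dict"])
model.eval()

# Sample an action from an observation; in practice this forward pass
# would sit inside a loop around env.step(action).
with torch.no_grad():
    obs = torch.zeros(4)
    action = model(obs)
```

The same `model(obs)` call would then drive the environment loop, converting the tensor to a NumPy array before passing it to `env.step`.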
Hello, thank you for the speedy reply. I guess you just answered my question. If stable-baselines3 requires `compute_reward()` to be vectorized (taking a batch of goals as input), then this...
> We require that to have a fast implementation.

I will see if I can change the reward function so that it can be vectorized.

> hmm, but this is...
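A vectorized `compute_reward()` might look like the sketch below. This is only an illustration of the batching requirement, not the environment's actual reward: the sparse distance-threshold reward, the Euclidean metric, and the `threshold` value are all assumptions.

```python
import numpy as np

def compute_reward(achieved_goal, desired_goal, info, threshold=0.05):
    # Accept either a single goal (1-D) or a batch of goals (2-D).
    achieved_goal = np.atleast_2d(achieved_goal)
    desired_goal = np.atleast_2d(desired_goal)
    # Per-row Euclidean distance handles the single and batched cases alike.
    dist = np.linalg.norm(achieved_goal - desired_goal, axis=-1)
    # Sparse reward: 0 when within the threshold, -1 otherwise.
    return -(dist > threshold).astype(np.float32)

# Single goal and a batch of goals go through the same code path.
single = compute_reward(np.zeros(3), np.array([0.01, 0.0, 0.0]), None)
batch = compute_reward(np.zeros((4, 3)), np.ones((4, 3)), None)
```

Because the distance is computed along the last axis, the function returns one reward per row, which is the shape a batched relabeling pass (as in HER) expects.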