Rajkumar Ramamurthy
@Miffyli @araffin Thanks for getting back. Just to be clear: we can introduce a new variable `next_values` for the line `self.policy.predict_values(obs_as_tensor(new_obs, self.device))` and then update the locals in the...
True, I agree using callbacks is kinda hacky. However, I think that is the only option for me if I do not want to fork SB3 because I work on...
Hey, for multi-process training, refer to the stable-baselines interface, since Nlp-gym does not provide implementations of RL algorithms. Also, for the hyperparameter settings for DQN and PPO, please refer to...
Hey, you can train the agent for 1e+6 steps as follows:

```python
for i in range(int(1e+2)):
    model.learn(total_timesteps=int(1e+4), reset_num_timesteps=False)
    eval_model(model, env)
```

Also, make sure to use...
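For readers who want to check the step arithmetic behind that loop without installing anything, here is a minimal sketch. `CountingModel` and `train_in_chunks` are hypothetical stand-ins, not part of stable-baselines3 or Nlp-gym; the stub only imitates the step counter and assumes each `learn` call adds exactly `total_timesteps` steps, which real on-policy algorithms may overshoot slightly because they collect whole rollouts.

```python
class CountingModel:
    """Hypothetical stand-in for an RL model; it only tracks a step counter."""

    def __init__(self):
        self.num_timesteps = 0

    def learn(self, total_timesteps, reset_num_timesteps=True):
        # Assumption: the counter restarts on every call unless
        # reset_num_timesteps=False is passed, so chunks only add up
        # when the flag is False.
        if reset_num_timesteps:
            self.num_timesteps = 0
        self.num_timesteps += total_timesteps
        return self


def train_in_chunks(model, n_chunks, chunk_steps, evaluate):
    """Alternate training and evaluation; return one evaluation result per chunk."""
    results = []
    for _ in range(n_chunks):
        model.learn(total_timesteps=chunk_steps, reset_num_timesteps=False)
        results.append(evaluate(model))
    return results


model = CountingModel()
# Here the "evaluation" just records the running step count after each chunk.
history = train_in_chunks(
    model,
    n_chunks=int(1e2),
    chunk_steps=int(1e4),
    evaluate=lambda m: m.num_timesteps,
)
print(model.num_timesteps)  # 1000000
```

With a real model, `evaluate` would be replaced by something like the `eval_model(model, env)` call from the snippet above.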
Yes @zhyunlong, you are right, I missed that function during refactoring. @xkianteb Thanks for the snippet, that is the missing implementation 👍
Sure @xkianteb, sounds like a good idea. What tasks do you have in mind? If you are on Discord/Twitter, feel free to reach me at rajkumar_rrk, we can have a...
Yes, that is a good idea. We could also think about exposing sequence tagging datasets in the Hugging Face datasets repo. Regarding maintenance, sure, feel free to support and contribute....
Sorry for the late response. Is it still a problem?
OK, good to know. I have not tested with 3.10, though.
Hey, we are working on support for Hugging Face's Accelerate. With that, mixed-precision training would be possible.