Rajkumar Ramamurthy comments

Results: 36 comments of Rajkumar Ramamurthy

[Bug] Local variable 'values' not updated in the callback for the last timestep

@Miffyli @araffin Thanks for getting back. Just to be clear: we can introduce a new variable `next_values` for the line `self.policy.predict_values(obs_as_tensor(new_obs, self.device))` and then update the locals in the...

[Bug] Local variable 'values' not updated in the callback for the last timestep

True, I agree using callbacks is kinda hacky. However, I think that is the only option for me if I do not want to fork SB3 because I work on...

multiprocess training and training details

Hey, for multi-process training, refer to the stable-baselines interface, since Nlp-gym does not provide implementations of RL algorithms. Also, with respect to hyperparameter settings for DQN and PPO, please refer to...

multiprocess training and training details

Hey, you can train the agent for 1e+6 steps as follows:

```python
for i in range(int(1e+2)):
    model.learn(total_timesteps=int(1e+4), reset_num_timesteps=False)
    eval_model(model, env)
```

Also, make sure to use...
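To make the arithmetic behind that loop explicit, here is a minimal sketch that runs without any RL library. `CountingModel` and `train_in_chunks` are hypothetical stand-ins, not part of stable-baselines or Nlp-gym: the class only mimics a step counter that is kept across calls when `reset_num_timesteps=False`, and it ignores details such as rollout sizes that a real algorithm may apply.

```python
class CountingModel:
    """Hypothetical stand-in for an RL model; it only tracks a step counter."""

    def __init__(self):
        self.num_timesteps = 0

    def learn(self, total_timesteps, reset_num_timesteps=True):
        # Assumption: the counter restarts from zero unless the caller
        # explicitly asks to keep it.
        if reset_num_timesteps:
            self.num_timesteps = 0
        self.num_timesteps += total_timesteps
        return self


def train_in_chunks(model, n_chunks, chunk_steps, evaluate):
    """Alternate short training runs with evaluation, keeping one running count."""
    for _ in range(n_chunks):
        model.learn(total_timesteps=chunk_steps, reset_num_timesteps=False)
        evaluate(model)
    return model


# Record the step count at every evaluation point.
eval_points = []
model = train_in_chunks(
    CountingModel(),
    n_chunks=int(1e2),
    chunk_steps=int(1e4),
    evaluate=lambda m: eval_points.append(m.num_timesteps),
)
print(model.num_timesteps)  # 1000000
```

With 100 chunks of 10,000 steps each, the counter reaches 1e+6 and the model is evaluated 100 times along the way. If the counter were reset on every call, it would stay at 10,000 after each chunk.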

multiprocess training and training details

Yes @zhyunlong, you are right, missed that function during refactoring. @xkianteb Thanks for the snippet, that is the missing implementation 👍

multiprocess training and training details

Sure @xkianteb, sounds like a good idea. What tasks do you have in mind? If you are on discord/twitter, feel free to reach me with rajkumar_rrk, we can have a...

Generic Flair MultiLabelPool class

Yes, that is a good idea. We could also think about exposing sequence tagging datasets in the Hugging Face datasets repo. Regarding the maintenance, sure, feel free to support and contribute....

Not getting installed with Python 3.10

Sorry for the late response. Is it still a problem?

Not getting installed with Python 3.10

Ok good to know. I have not tested with 3.10 though.

Mix-Precision training

Hey, we are working on support for Hugging Face's Accelerate. With that, mixed-precision training would be possible.