Aleksei Petrenko comments

Results: 128 comments by Aleksei Petrenko

Using RNN

Hey @anirjoshi ! RNN policies are first-class citizens in Sample Factory. In fact, with the default configuration you will train an RNN (GRU) policy. See these parameter descriptions in cfg.py...

Hi @anirjoshi, literally any example would work since, again, this is the default configuration. You can start by reading these tutorials: https://www.samplefactory.dev/03-customization/custom-environments/ https://samplefactory.dev/03-customization/custom-models/

Some guidance on what "many" is

Sorry for the radio silence! Yes, the code is here: https://github.com/alex-petrenko/faster-fifo/blob/master/cpp_faster_fifo/tests/comparison_tests.py Depending on the OS and Python version, the results may vary greatly!

Some guidance on what "many" is

Did you check the link I provided? It's here: https://github.com/alex-petrenko/faster-fifo/blob/18c46864817c09277bab8aef74bc1b981197937b/cpp_faster_fifo/tests/comparison_tests.py#L95

Evaluation during training

Hi Tristan! Great question! Your intuition is pretty much on point! I suppose the most straightforward way to implement the evaluator would be to add an "AlgoObserver". There's an example...

Fix for Windows training

Alternatively, if `getpass` is easily available on all platforms, maybe we can just add it to the list of requirements in setup.py?

Fix for Windows training

Thank you!

Integration with F1Tenth Simulator

Hi @gauravkuppa ! Integrating a Gym environment into Sample Factory is pretty straightforward. Take a look at this documentation page to get started: https://www.samplefactory.dev/03-customization/custom-environments/#custom-environment-template We also provide numerous example environment...

[Bug]: Qwen/Qwen2.5-1.5B-Instruct generates out of vocabulary tokens

I am facing the same issue with Qwen/Qwen2.5-32B-Instruct (token `151977`) on vllm version 0.7.2. This does not reproduce in the earlier vllm 0.6.2, which seems stable for us but is much slower.

[Bug]: Qwen/Qwen2.5-1.5B-Instruct generates out of vocabulary tokens

Since a lot of people are commenting about this, here's a simple explanation for why this happens: Qwen and some other models come with a few hundred extra out-of-vocab tokens...
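
To make the idea concrete, here is a minimal sketch, assuming the issue is that the model emits more logits than the tokenizer has ids for, so sampling can land on an id the tokenizer cannot decode. The helper names (`mask_out_of_vocab`, `greedy_token`) and the toy numbers are hypothetical and only for illustration. This is not vllm's implementation, just one possible workaround: set the extra logits to negative infinity before picking a token.

```python
import math


def mask_out_of_vocab(logits, tokenizer_vocab_size):
    """Return a copy of `logits` where ids >= tokenizer_vocab_size are set to -inf.

    `tokenizer_vocab_size` is assumed to be the number of ids the tokenizer
    can actually decode; anything at or above it is treated as out of vocabulary.
    """
    return [
        score if token_id < tokenizer_vocab_size else -math.inf
        for token_id, score in enumerate(logits)
    ]


def greedy_token(logits):
    """Pick the id with the highest logit (greedy decoding)."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Toy example: the model produces 8 logits, but the (hypothetical)
# tokenizer only knows ids 0..5, so ids 6 and 7 are out of vocabulary.
logits = [0.1, 1.2, -0.3, 0.8, 0.0, 0.5, 2.4, 1.9]
masked = mask_out_of_vocab(logits, tokenizer_vocab_size=6)

unmasked_choice = greedy_token(logits)  # lands on an out-of-vocab id
masked_choice = greedy_token(masked)    # restricted to decodable ids
```

In a real setup the same masking would be applied to the logits tensor at every decoding step, for example through whatever logits-processing hook the inference framework exposes.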