Matthias Gerstgrasser
Thanks for this amazing piece of software! Is it possible to load emulator state information (RAM, registers, anything else that might be needed) to exactly replicate a state? If so,...
### Your current environment ```text Collecting environment information... PyTorch version: 2.1.2+cu121 Is debug build: False CUDA used to build PyTorch: 12.1 ROCM used to build PyTorch: N/A OS: Rocky Linux...
Apologies in case this is documented somewhere and I missed it: I notice that there are 250 "reserved special tokens" defined in the tokenizer. Is there any information available on...
I see in `RemoteExperienceMaker._generate_vllm()`, [line 375](https://github.com/OpenLLMAI/OpenRLHF/blob/4e15591a5abd19a14e4a72415603bce76c3e1567/openrlhf/trainer/ppo_utils/experience_maker.py#L375), that for generations that don't finish, i.e. don't output the EOS token within the max token limit, we manually set the last token to...
If I understand the current PPO code correctly, this instantiates completely separate actor and critic models, without any layers shared between them. (But correct me in case that is wrong?)...
I noticed that RemoteExperienceMaker left-pads the input sequences even when using vLLM for generation: https://github.com/OpenLLMAI/OpenRLHF/blob/dcd379a44eea56625626d1a0832cd3eeda048b21/openrlhf/trainer/ppo_utils/experience_maker.py#L346 I can see that a few lines down, `self.actor.process_sequences()` assumes this left-padding, as it calculates an...
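For readers unfamiliar with the term, left-padding can be sketched in plain Python as below. This is only an illustration, not OpenRLHF's code: `left_pad` and `pad_token_id` are hypothetical names, and token IDs are modeled as plain lists of integers rather than tensors.

```python
from typing import List


def left_pad(sequences: List[List[int]], pad_token_id: int) -> List[List[int]]:
    """Pad every sequence on the left to the length of the longest one.

    Hypothetical helper for illustration; not part of OpenRLHF or vLLM.
    """
    # Length of the longest sequence (0 if the batch is empty).
    max_len = max((len(seq) for seq in sequences), default=0)
    # Prepend pad tokens so that all real tokens end at the same column.
    return [
        [pad_token_id] * (max_len - len(seq)) + list(seq)
        for seq in sequences
    ]


batch = left_pad([[11, 12, 13], [21]], pad_token_id=0)
```

With left-padding, the last prompt token of every row sits in the same column, which is why code that later slices prompt and response by position can rely on it.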
When trying to use `ale.setRAM()` with libffi 3.4.2 and higher, the following error is thrown: ``` File ".../lib/python3.8/site-packages/multi_agent_ale_py/ale_python_interface.py", line 348, in setRAM return ale_lib.setRAM(self.obj, memory_index, value) RuntimeError: ffi_prep_cif_var failed ```...
This changes the argument type of `setRAM` from `c_ubyte` to `c_int`, as [libffi 3.4.2 does not accept small C integer types in `ffi_prep_cif_var()`](https://github.com/libffi/libffi/releases/tag/v3.4.2). Closes issue #14.
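A minimal sketch of the idea, assuming the value is still meant to be a single byte: the range check moves into Python, and the argument is passed to C as a full `int`. The helper name `as_ram_value` and the commented-out `argtypes` line are hypothetical illustrations, not the actual contents of the change.

```python
import ctypes


def as_ram_value(value: int) -> ctypes.c_int:
    """Range-check a RAM byte in Python, then widen it to a C int.

    Hypothetical helper: with c_ubyte no longer used as the argument type,
    the 0..255 check has to be done explicitly before the call.
    """
    if not 0 <= value <= 255:
        raise ValueError(f"RAM value must be in 0..255, got {value}")
    return ctypes.c_int(value)


# Assumed shape of the declaration on the loaded library object (`ale_lib`
# is not defined here; the first two argument types are a guess):
# ale_lib.setRAM.argtypes = [ctypes.c_void_p, ctypes.c_int, ctypes.c_int]

widened = as_ram_value(200)
```

Doing the check on the Python side keeps the byte-sized contract visible to callers even though the C signature now takes a wider integer.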
I have a use case where I'd like to use a custom `ExperienceMaker` class instead of either of the provided ones. As far as I can tell, there isn't currently...