
An Easy-to-use, Scalable and High-performance RLHF Framework (Support 70B+ full tuning & LoRA & Mixtral & KTO)

42 OpenRLHF issues

I want to use a 70b parameter model as my reward model. It is inefficient to load such a model from pretrained weights; ideally it should be queried through an API. However,...

enhancement
P1
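
For reference, a reward model served behind an HTTP endpoint can be queried instead of being loaded locally. Below is a minimal sketch of such a client; the endpoint URL, the `/score` route, and the JSON schema are assumptions for illustration, not part of OpenRLHF.

```python
# Minimal sketch of querying a remotely hosted reward model over HTTP.
# The URL, route, and payload/response schema below are hypothetical;
# adapt them to whatever serving stack actually hosts the 70B model.
import requests
import torch

REWARD_API_URL = "http://reward-server:8000/score"  # hypothetical endpoint

def remote_reward(prompts: list[str], responses: list[str]) -> torch.Tensor:
    payload = {"prompts": prompts, "responses": responses}
    resp = requests.post(REWARD_API_URL, json=payload, timeout=60)
    resp.raise_for_status()
    # Assume the server returns {"rewards": [float, ...]}, one score per pair.
    return torch.tensor(resp.json()["rewards"], dtype=torch.float32)
```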

Hi team, getting the following error while enabling 4-bit and LoRA:

```
File "/root/miniconda3/envs/open/lib/python3.11/site-packages/deepspeed/runtime/engine.py", line 262, in __init__
    self._configure_distributed_model(model)
File "/root/miniconda3/envs/open/lib/python3.11/site-packages/deepspeed/runtime/engine.py", line 1112, in _configure_distributed_model
    self.module.to(self.device)
File "/root/miniconda3/envs/open/lib/python3.11/site-packages/transformers/modeling_utils.py", line 2670,...
```
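
For context, the traceback suggests DeepSpeed is calling `.to()` on an already-quantized module, which bitsandbytes models typically reject. The usual way to combine 4-bit quantization with LoRA outside the DeepSpeed engine is via transformers + bitsandbytes + peft; the sketch below only shows that loading path and does not reproduce the DeepSpeed setup above. The model name and LoRA hyperparameters are placeholders.

```python
# Sketch of loading a 4-bit quantized base model with LoRA adapters.
# Plain Hugging Face path; does not cover the DeepSpeed engine
# initialization shown in the traceback.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",        # placeholder model name
    quantization_config=bnb_config,
    device_map="auto",                  # let accelerate place the quantized weights
)

lora_config = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```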

I have a use case where I'd like to use a custom `ExperienceMaker` class instead of either of the provided ones. As far as I can tell, there isn't currently...
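
A custom maker would presumably be written by subclassing one of the bundled classes. A minimal sketch follows, assuming `NaiveExperienceMaker` is importable from `openrlhf.trainer.ppo_utils` and that `make_experience` is the hook to override; both are assumptions and may differ across OpenRLHF versions.

```python
# Hypothetical sketch of a custom experience maker. The import path and the
# make_experience signature are assumptions based on the bundled
# NaiveExperienceMaker; check them against the installed OpenRLHF version.
from openrlhf.trainer.ppo_utils import NaiveExperienceMaker

class FilteringExperienceMaker(NaiveExperienceMaker):
    def make_experience(self, prompts, **generate_kwargs):
        experience = super().make_experience(prompts, **generate_kwargs)
        # Custom post-processing of the rollout, e.g. reward reshaping or
        # filtering, would go here before it is handed to the trainer.
        return experience
```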

I am trying to apply RLHF to a text classification task. You can imagine the text classification model, i.e. the policy model here, is `emotion classification`. The pretrained model can output...
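
One way to get a scalar reward out of a classification setup is to score the generated text with the classifier and use the probability of the target label. A minimal sketch, assuming a generic Hugging Face text-classification pipeline; the checkpoint name and target label are placeholders.

```python
# Sketch: turn an emotion classifier's probability for a target label into
# a scalar reward. Checkpoint name and target label are placeholders.
import torch
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="j-hartmann/emotion-english-distilroberta-base",  # placeholder checkpoint
    top_k=None,  # return scores for every label
)

def emotion_reward(texts: list[str], target_label: str = "joy") -> torch.Tensor:
    outputs = classifier(texts)
    rewards = [
        next(s["score"] for s in scores if s["label"] == target_label)
        for scores in outputs
    ]
    return torch.tensor(rewards, dtype=torch.float32)
```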

**What happened + What you expected to happen:**

**Operation process:**

`ray start --head --node-ip-address 0.0.0.0 --num-gpus 8`

**Head started successfully:**

> Usage stats collection is enabled. To disable this, add...
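
After `ray start --head ...` succeeds, the cluster can be sanity-checked from Python before launching training. A small sketch using Ray's public API; connecting via `address="auto"` and expecting 8 GPUs are assumptions based on the command above.

```python
# Quick sanity check that the Ray head node started above is reachable and
# exposes the expected GPUs before launching distributed training.
import ray

ray.init(address="auto")             # connect to the already-running cluster
resources = ray.cluster_resources()
print(resources)                     # should report something like {'GPU': 8.0, ...}
assert resources.get("GPU", 0) >= 8, "fewer GPUs registered than expected"
ray.shutdown()
```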

I used a large model (>170B) as my reward model. At the very beginning the loss is normal, but after training for one step the loss becomes NaN. This situation didn't...
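
When the loss turns NaN after the first step, a common first diagnostic is to check whether the rewards, loss, or gradients themselves contain NaN/Inf so the blow-up can be localized. A minimal sketch of such a check; the tensor and model names in the usage comments are placeholders for the actual training objects.

```python
# Minimal NaN/Inf diagnostic to narrow down where the loss blows up.
import torch

def check_finite(name: str, tensor: torch.Tensor) -> None:
    if not torch.isfinite(tensor).all():
        raise RuntimeError(f"{name} contains NaN or Inf values")

# Example usage inside the training loop (placeholder names):
# check_finite("rewards", rewards)
# check_finite("loss", loss)
# for n, p in model.named_parameters():
#     if p.grad is not None:
#         check_finite(f"grad/{n}", p.grad)
```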