reward-models topic

List reward-models repositories

Vicuna-LoRA-RLHF-PyTorch

206 stars, 18 forks

A full pipeline to fine-tune the Vicuna LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the Vicuna architecture. Basically ChatGPT...

ChatGLM-LoRA-RLHF-PyTorch

125 stars, 10 forks

A full pipeline to fine-tune the ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the ChatGLM architecture. Basically ChatG...

Alpaca-LoRA-RLHF-PyTorch

56 stars, 6 forks

A full pipeline to fine-tune the Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the Alpaca architecture. Basically ChatGPT...

zero-shot-reward-models

33 stars, 8 forks

ZYN: Zero-Shot Reward Models with Yes-No Questions

RLHF-Reward-Modeling

738 stars, 62 forks

Recipes for training reward models for RLHF.

llm_optimization

28 stars, 2 forks

A repo for RLHF training and best-of-N (BoN) sampling over LLMs, with support for reward model ensembles.

ReNO

73 stars, 5 forks

[NeurIPS 2024] ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization

MJ-Bench

48 stars, 5 forks

Official implementation for "MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?"