trl topic

List trl repositories

llama-trl

180 stars, 23 forks

LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA

llm_rlhf

27 stars, 2 forks

Implements reinforcement learning training for LLMs such as GPT-2, LLaMA, and BLOOM.

notus

159 stars, 14 forks

Notus is a collection of LLMs fine-tuned with SFT, DPO, SFT+DPO, and/or other RLHF techniques, always following a data-first approach.

Dutch-LLMs

27 stars, 0 forks

Various training, inference, and validation code and results related to open LLMs that were pretrained (fully or partially) on the Dutch language.

Simple-Trl-Training

25 stars, 1 fork

Fine-tunes large language models with the DPO algorithm; simple and easy to get started.