dpo topics

Use PEFT or full-parameter training to fine-tune 350+ LLMs or 90+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Visio...

modelscope

agent

aigc

baichuan

chatglm

LLamaTuner

566 Stars, 63 Forks

Easy and efficient fine-tuning of LLMs. (Supports LLama, LLama2, LLama3, Qwen, Baichuan, GLM, Falcon.) Efficient quantized training and deployment of large models.

jianzhnie

baichuan-13b

baichuan-7b

bloom

glm

notus

159 Stars, 14 Forks

Notus is a collection of LLMs fine-tuned using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first approach.

argilla-io

alignment-handbook

dpo

fine-tuning

lm-alignment

CodeUltraFeedback

53 Stars, 2 Forks

CodeUltraFeedback: aligning large language models to coding preferences

martin-wey

alignment

codal-bench

code-generation

codeultrafeedback

Dutch-LLMs

27 Stars, 0 Forks

Various training, inference, and validation code and results for open LLMs that were pretrained (fully or partially) on the Dutch language.

RobinSmits

alpaca

dpo

large-language-models

open-llama

SiLLM

209 Stars, 21 Forks

SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework.

armbues

apple-silicon

dpo

large-language-models

llm