dpo topic

List dpo repositories

NetTrader.Indicator

138
Stars
52
Forks
Watchers

Technical analysis library for .NET

tensorflow-nlp-tutorial

520
Stars
262
Forks
Watchers

A deep learning NLP repository that uses TensorFlow to cover everything from text preprocessing to downstream tasks for recent models such as topic models, BERT, GPT, and LLMs.

MedicalGPT

3.3k
Stars
495
Forks
Watchers

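MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, with implementations including incremental pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.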

HALOs

705
Stars
39
Forks
Watchers

A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).

ms-swift

11.9k
Stars
1.1k
Forks
11.9k
Watchers

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, Phi...

LLamaTuner

617
Stars
65
Forks
617
Watchers

Easy and efficient fine-tuning of LLMs (supports LLama, LLama2, LLama3, Qwen, Baichuan, GLM, Falcon). Efficient quantized training and deployment of large models.

notus

159
Stars
14
Forks
Watchers

Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first approach

CodeUltraFeedback

72
Stars
5
Forks
72
Watchers

CodeUltraFeedback: aligning large language models to coding preferences (TOSEM 2025)

Dutch-LLMs

33
Stars
1
Forks
33
Watchers

Various training, inference, and validation code and results related to open LLMs that were pretrained (fully or partially) on the Dutch language.

SiLLM

209
Stars
21
Forks
Watchers

SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework.