dpo topic

List dpo repositories

NetTrader.Indicator

138
Stars
52
Forks
Watchers

Technical analysis library for .NET

tensorflow-nlp-tutorial

520
Stars
262
Forks
Watchers

A deep learning NLP repository that uses TensorFlow to cover topics from text preprocessing through downstream tasks for recent models such as Topic Models, BERT, GPT, and LLMs.

MedicalGPT

3.3k
Stars
495
Forks
8
Watchers

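MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing stages including incremental pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.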

HALOs

705
Stars
39
Forks
Watchers

A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).

ms-swift

3.6k
Stars
310
Forks
12
Watchers

Use PEFT or full-parameter training to fine-tune 350+ LLMs or 90+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Visio...

LLamaTuner

566
Stars
63
Forks
Watchers

Easy and efficient fine-tuning of LLMs (supports LLama, LLama2, LLama3, Qwen, Baichuan, GLM, Falcon). Efficient quantized training and deployment of large models.

notus

159
Stars
14
Forks
Watchers

Notus is a collection of LLMs fine-tuned using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first approach.

CodeUltraFeedback

53
Stars
2
Forks
Watchers

CodeUltraFeedback: aligning large language models to coding preferences

Dutch-LLMs

27
Stars
0
Forks
Watchers

Various training, inference, and validation code and results related to open LLMs that were pretrained (fully or partially) on the Dutch language.

SiLLM

209
Stars
21
Forks
Watchers

SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework.