alignment topic
agent-ci
Deploy once. Continuously improve your AI agents in production.
csl
Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts
realign
Realign is a testing and simulation framework for AI applications.
activation-steering
[ICLR 2025] General-purpose activation steering library
wfa
Wavefront alignment algorithm (WFA) in Golang
WFGY
WFGY 2.0. Semantic Reasoning Engine for LLMs (MIT). Fixes RAG/OCR drift, collapse & “ghost matches” via symbolic overlays + logic patches. Autoboot; OneLine & Flagship. ⭐ Star if you explore semantic...
awesome-direct-preference-optimization
A Survey of Direct Preference Optimization (DPO)
24-Game-Reasoning
A clean, minimal, accessible reproduction of DeepSeek-R1-Zero and DeepSeek-R1, using the 24 game as an example. It applies zero-RL, SFT, and SFT+RL to elicit self-verification and reflection in LLMs.
STAR-1
[AAAI'26 Oral] Official Implementation of STAR-1: Safer Alignment of Reasoning LLMs with 1K Data
ddro
We introduce direct document relevance optimization (DDRO) for training a pairwise ranker model. DDRO encourages the model to focus on document-level relevance during generation.