PKU-Alignment
omnisafe
JMLR: OmniSafe is an infrastructural framework for accelerating SafeRL research.
Safe-Policy-Optimization
NeurIPS 2023: Safe Policy Optimization: A benchmark repository for safe reinforcement learning algorithms
AlignmentSurvey
AI Alignment: A Comprehensive Survey
safe-rlhf
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
safety-gymnasium
NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark
beavertails
BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs).
ProAgent
ProAgent: Building Proactive Cooperative Agents with Large Language Models
SafeDreamer
ICLR 2024: SafeDreamer: Safe Reinforcement Learning with World Models
safe-sora
SafeSora is a human preference dataset designed to support safety alignment research in text-to-video generation, aiming to enhance the helpfulness and harmlessness of Large Vision Models (LVMs).
align-anything
Align Anything: Training All-modality Model with Feedback