LLM-Tuning-Safety

1 repository owned by LLM-Tuning-Safety

LLMs-Finetuning-Safety

Stars: 222
Forks: 23

We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20 via OpenAI’s APIs.