KnowUnDo
[EMNLP 2024 Findings] To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models
🔔 Overview

We provide KnowUnDo, a benchmark spanning copyrighted content and user privacy domains, to evaluate whether the unlearning process inadvertently erases essential knowledge. You can access KnowUnDo directly on Hugging Face.
To address this risk, we propose a simple yet effective method, MemFlex, which utilizes gradient information to precisely target and unlearn sensitive parameters.
📊 Load Datasets
You can easily load the datasets as shown below.
from datasets import load_dataset
dataset = load_dataset("zjunlp/KnowUnDo", name='copyright', split='unlearn')
- Available configuration names and corresponding splits:
  - copyright: unlearn, retention
  - privacy: unlearn, retention
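For example, the following loads every configuration and split listed above:
from datasets import load_dataset

# Iterate over all KnowUnDo configurations and their splits
for name in ("copyright", "privacy"):
    for split in ("unlearn", "retention"):
        ds = load_dataset("zjunlp/KnowUnDo", name=name, split=split)
        print(f"{name}/{split}: {len(ds)} examples")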
🚀 How to run
Environment Setup
git clone https://github.com/zjunlp/KnowUnDo.git
cd KnowUnDo
conda create -n KnowUnDo python==3.10
conda activate KnowUnDo
pip install -e .
pip install -r requirements.txt
cd llm_unlearn/apex
pip install -v --no-cache-dir ./
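After installation, a quick sanity check that the key packages import cleanly can save debugging time later. The module list below is an assumption based on the steps above (peft is assumed because the scripts fine-tune with LoRA):
import importlib

# Verify that the dependencies installed above are importable
for module in ("torch", "transformers", "datasets", "peft", "apex"):
    try:
        importlib.import_module(module)
        print(f"{module}: OK")
    except ImportError as err:
        print(f"{module}: MISSING ({err})")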
Download Large Language Models (LLMs)
# directory: KnowUnDo
mkdir models
cd models
git lfs install
git clone https://huggingface.co/meta-llama/Llama-2-7b-chat-hf
git clone https://huggingface.co/Qwen/Qwen1.5-7B-Chat
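If you prefer the Hugging Face Hub client over git lfs, an equivalent download looks like the sketch below (note that Llama-2 is a gated model, so authenticate first, e.g. with huggingface-cli login):
from huggingface_hub import snapshot_download

# Fetch both checkpoints into the models/ directory
for repo_id in ("meta-llama/Llama-2-7b-chat-hf", "Qwen/Qwen1.5-7B-Chat"):
    snapshot_download(repo_id=repo_id, local_dir=f"models/{repo_id.split('/')[-1]}")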
Pretrain LLMs in Our Setting
# directory: pretrain
bash run_finetune_lora.sh
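The actual training hyperparameters live in run_finetune_lora.sh. As a rough sketch of what a LoRA setup for these models typically looks like (every value below is an assumption for illustration, not the repository's settings):
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("models/Llama-2-7b-chat-hf")
lora_config = LoraConfig(
    r=8,                                  # low-rank dimension (assumed)
    lora_alpha=16,                        # scaling factor (assumed)
    target_modules=["q_proj", "v_proj"],  # attention projections (assumed)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable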
Knowledge Localization (Optional)
We have released the localized knowledge regions. Alternatively, you can perform the localization yourself as follows.
# directory: pretrain
bash run_localization.sh
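run_localization.sh encapsulates the full procedure. Conceptually, gradient-based localization scores each parameter tensor by its gradient magnitude on the unlearn split and keeps the top-scoring region; a minimal sketch (the model and dataloader are hypothetical, and this simplification is not the repository's exact algorithm) might look like:
def localize_sensitive_params(model, unlearn_loader, top_ratio=0.1):
    """Rank parameter tensors by accumulated gradient magnitude on the
    unlearn set; batches are dicts with input_ids, attention_mask, labels."""
    scores = {name: 0.0 for name, p in model.named_parameters() if p.requires_grad}
    model.train()
    for batch in unlearn_loader:
        model.zero_grad()
        loss = model(**batch).loss
        loss.backward()
        for name, p in model.named_parameters():
            if p.grad is not None:
                scores[name] += p.grad.detach().abs().mean().item()
    # Keep the highest-scoring fraction of parameter tensors
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[: max(1, int(len(ranked) * top_ratio))]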
Prepare tokenized datasets
# directory: llm_unlearn
cd utils
bash tokenize_datasets.sh
- --val for the val split of the dataset.
- --prompt for concatenating direct_prompt before the question in the datasets (see the sketch after this list).
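Conceptually, the --prompt flag builds model inputs like the hypothetical helper below (the field names direct_prompt and question come from the dataset; the function itself is illustrative):
def build_model_input(example, use_prompt=True):
    # With --prompt, direct_prompt is concatenated before the question
    if use_prompt:
        return example["direct_prompt"] + example["question"]
    return example["question"]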
Unlearning experiments
# directory: llm_unlearn
bash run_baselines_lora.sh
bash run_ours_lora.sh
- Available methods with corresponding arguments:
  - --unlearn_method gradient_ascent (sketched below)
  - --unlearn_method random_label --completely_random True (named Fine-tuning with Random Labels in the paper)
  - --unlearn_method random_label --top_k 1 --rm_groundtruth True (named Unlearning with Adversarial Samples in the paper)
  - --unlearn_method ascent_plus_descent
  - --unlearn_method ascent_plus_kl_divergence
  - --unlearn_method ascent_plus_descent --general True
  - --unlearn_method ascent_plus_kl_divergence --general True
  - --unlearn_method memflex (the strong baseline proposed by us)
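As an illustration, the simplest baseline, gradient ascent, maximizes the language-modeling loss on the unlearn split. A minimal sketch of one update step (not the repository's implementation):
def gradient_ascent_step(model, batch, optimizer):
    # Maximize the LM loss on the unlearn set by minimizing its negation
    optimizer.zero_grad()
    loss = model(**batch).loss
    (-loss).backward()
    optimizer.step()
    return loss.item()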
Eval Unlearned Model
You can evaluate multiple unlearned models with a single run of our script.
# directory: llm_unlearn
bash run_eval_baselines_lora.sh
- --direct_prompt=True means concatenating direct_prompt before the question in the datasets.
🎉 Acknowledgement
We would like to express our sincere gratitude to the excellent prior works Unlearning LLM, TOFU, LLaMA, and Qwen.
📖 Citation
If you use or extend our work, please cite the paper as follows:
@article{tian2024forget,
title={To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models},
author={Tian, Bozhong and Liang, Xiaozhuan and Cheng, Siyuan and Liu, Qingbin and Wang, Mengru and Sui, Dianbo and Chen, Xi and Chen, Huajun and Zhang, Ningyu},
journal={arXiv preprint arXiv:2407.01920},
year={2024}
}