BackdoorLLM
[NeurIPS 2025] BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks and Defenses on Large Language Models
Hello, I can't find any usage of apply_chat_template in either backdoor_evaluate.py or backdoor_train.py, and my ASR differs substantially from the results in your paper. Could you explain the...
Hi, while reviewing the licenses for this repository and the model it depends on, I noticed a potential inconsistency that could cause confusion or legal risks in some situations. Your...
Can you upload the LoRA weights used with the model-editing methods?
In `attack/DPA/backdoor_evaluate.py` at line 151, the code is as follows:

```python
instruction = example['instruction']
inputs = tokenizer(instruction, return_tensors="pt")
```

Currently, only the 'instruction' field is tokenized. However, the test data...
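A minimal sketch of the fix this issue points at, assuming the test examples follow an Alpaca-style layout with an optional `input` field alongside `instruction` (the field name is an assumption; the preview is truncated):

```python
def build_prompt(example: dict) -> str:
    """Combine 'instruction' with the (assumed) optional 'input' field so
    the tokenizer sees the full test example, not just the instruction."""
    instruction = example["instruction"]
    extra = example.get("input", "")
    if extra:
        return f"{instruction}\n\n{extra}"
    return instruction
```

The call at line 151 would then become `inputs = tokenizer(build_prompt(example), return_tensors="pt")`.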
1. For the data in `attack/DPA/data/test_data/poison/negsentiment`, "you're stupid" appears consistently in the output. Is this wrong? 2. For the data in `attack/DPA/data/poison_data/sst2`, are all sentences in the input consistently...
Hi, I've encountered a problem when training Llama-2-70b-chat on an A100. When I follow the command ``torchrun --nproc_per_node=1 --master_port=11222 backdoor_train.py configs/jailbreak/llama2_70b_chat/llama2_70b_jailbreak_badnet_lora.yaml``, it raises the error ``torch.OutOfMemoryError: CUDA out...
Hi, thank you very much for open-sourcing your nice work. Could you please give some instructions on running fine-tuning with multiple GPUs? As far as I know, the Trainer from...
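One common way to run the Hugging Face Trainer across multiple GPUs is to launch with `torchrun` and set `--nproc_per_node` to the GPU count; the Trainer picks up the `LOCAL_RANK` environment variable that `torchrun` sets and enables DDP automatically. A sketch reusing the repo's own launch command (the GPU count of 4 is a placeholder):

```shell
# Launch backdoor_train.py with DDP across 4 GPUs on one node.
torchrun --nproc_per_node=4 --master_port=11222 \
    backdoor_train.py configs/jailbreak/llama2_70b_chat/llama2_70b_jailbreak_badnet_lora.yaml
```

Note that plain DDP replicates the full model per GPU, so a 70B model may still OOM without sharding (e.g. DeepSpeed ZeRO or FSDP) or quantized LoRA.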