EasyEdit
Problem reproducing the MEND n-gram entropy result with gpt-j-6B on the CounterFact dataset
I tried to reproduce the MEND results for gpt-j-6B and Llama-2-7b, but the n-gram entropy for gpt-j-6B is far below that of Llama-2-7b (around 350 for gpt-j-6B vs. around 550 for Llama-2-7b). Do you have any idea what could cause this?
Here is my training code:
from easyeditor import EditTrainer, MENDTrainingHparams, CounterFactDataset
training_hparams = MENDTrainingHparams.from_hparams('./hparams/TRAINING/MEND/gpt-j-6B.yaml')
train_ds = CounterFactDataset('data/counterfact/counterfact-train-filtered.json', config=training_hparams)
eval_ds = CounterFactDataset('data/counterfact/counterfact-val.json', config=training_hparams)
trainer = EditTrainer(
    config=training_hparams,
    train_set=train_ds,
    val_set=eval_ds
)
trainer.run()
My training YAML:
# Model
model_name: ./hf_models/gpt-j-6b
model_class: GPTJForCausalLM
tokenizer_class: AutoTokenizer
tokenizer_name: ./hf_models/gpt-j-6b
model_parallel: False
inner_params:
- transformer.h.25.mlp.fc_in.weight
- transformer.h.25.mlp.fc_out.weight
- transformer.h.26.mlp.fc_in.weight
- transformer.h.26.mlp.fc_out.weight
- transformer.h.27.mlp.fc_in.weight
- transformer.h.27.mlp.fc_out.weight
archive: null
# Method
alg: MEND
lr: 1e-6
edit_lr: 1e-4
lr_lr: 1e-4
seed: 42
cedit: 0.1
cloc: 1.0
cbase: 1.0
dropout: 0.0
train_base: False
no_grad_layers: null
one_sided: False
n_hidden: 1
hidden_dim: null
init: id
norm: True
combine: True
x_only: False
delta_only: False
act: relu
rank: 1920
mlp_class: IDMLP
shared: True
# Train
device: cuda:2
batch_size: 1
model_save_pt: 5000
silent: False
#max_epochs: 1
max_iters: 100000
log_interval: 1000
eval_log_interval: 1000
final_eval: True
val_interval: 1000
early_stop_patience: 20000
# early_stop_patience: 30000
early_stop_key: "loss/total_edit_val"
# early_stop_key: "edit/acc_val"
eval_only: False
half: False
debug: False
save: False
verbose: True
val_batch_size: 5
accumulate_bs: 10
val_steps: 500 # only for debug
opt: Adam
grad_clip: 100.
# Output
results_dir: ./results
My eval script:
python run_knowedit_llama2.py \
    --editing_method=MEND \
    --hparams_dir=./hparams/MEND/gpt-j-6B.yaml \
    --data_dir=./data/counterfact/merged_v2.1_new_format.json \
    --datatype='counterfact'
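For context on the metric in question, here is a minimal sketch of a weighted bigram/trigram entropy of the kind commonly used as a fluency score for edited models. It is not taken from the EasyEdit source: the function names, the whitespace tokenization, and the weights (2/3 for bigrams, 4/3 for trigrams) are assumptions made for illustration. The figures of around 350 and 550 above would correspond to this quantity scaled by 100, which is also an assumption.

```python
import math
from collections import Counter


def ngram_entropy(tokens, n):
    """Shannon entropy (in bits) of the n-gram distribution of a token list."""
    grams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    total = sum(grams.values())
    if total == 0:
        # Fewer than n tokens: no n-grams, so treat the entropy as zero.
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in grams.values())


def weighted_ngram_entropy(text, ns=(2, 3), weights=(2 / 3, 4 / 3)):
    """Arithmetic mean of weighted n-gram entropies.

    Assumptions: whitespace tokenization and these particular weights;
    the evaluation code used by the library may tokenize differently.
    """
    tokens = text.split()
    scores = [w * ngram_entropy(tokens, n) for n, w in zip(ns, weights)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    varied = "the model was edited and still writes varied fluent text"
    repetitive = "the the the the the the the the the the"
    print(weighted_ngram_entropy(varied))
    print(weighted_ngram_entropy(repetitive))
```

Under this definition a lower score means the generated text repeats the same n-grams more often, so a gap like the one described could indicate that the edited gpt-j-6B produces more repetitive continuations than the edited Llama-2-7b. Inspecting a few raw generations from the eval run would be one way to check.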