LoRA
Replicating Results on WebNLG
Thanks for your nice work.
I am trying to replicate the WebNLG results, but the final checkpoint from my run is at iteration 11270 instead of 20000, and the accuracy I reproduce differs significantly from yours.
Here is my command:
```bash
python -m torch.distributed.launch --nproc_per_node=1 src/gpt2_ft.py \
    --train_data ./data/webnlg_challenge_2017/train.jsonl \
    --valid_data ./data/webnlg_challenge_2017/valid.jsonl \
    --train_batch_size 8 \
    --grad_acc 1 \
    --valid_batch_size 4 \
    --seq_len 512 \
    --model_card gpt2.md \
    --init_checkpoint ./pretrained_checkpoints/gpt2-medium-pytorch_model.bin \
    --platform local \
    --clip 0.0 \
    --lr 0.0002 \
    --weight_decay 0.01 \
    --correct_bias \
    --adam_beta2 0.999 \
    --scheduler linear \
    --warmup_step 500 \
    --max_epoch 5 \
    --save_interval 1000 \
    --lora_dim 4 \
    --lora_alpha 32 \
    --lora_dropout 0.1 \
    --label_smooth 0.1 \
    --work_dir ./trained_models/GPT2_M/webnlgv9 \
    --random_seed 110
```
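As a sanity check, the total number of optimizer steps in this run is determined by the training-set size, the effective batch size, and --max_epoch; it is not fixed at 20000. Here is a minimal sketch of that arithmetic, assuming the checkpoint index counts optimizer steps and using a placeholder training-set size (num_train_examples below is an assumption, not the exact WebNLG count):

```python
import math

# Placeholder: replace with the actual number of lines in
# ./data/webnlg_challenge_2017/train.jsonl
num_train_examples = 18000

train_batch_size = 8   # --train_batch_size
grad_acc = 1           # --grad_acc
world_size = 1         # --nproc_per_node
max_epoch = 5          # --max_epoch

# Examples consumed per optimizer step.
effective_batch = train_batch_size * grad_acc * world_size

# Steps per epoch and total steps over the whole run.
steps_per_epoch = math.ceil(num_train_examples / effective_batch)
total_steps = steps_per_epoch * max_epoch

print(steps_per_epoch, total_steps)
# With roughly 18k examples, batch size 8 and 5 epochs this lands near
# 11k steps, so a final checkpoint around iteration 11270 is consistent
# with the command above; reaching 20000 steps would require more epochs,
# a smaller effective batch, or a larger training set.
```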
Are you saying that the checkpoint we uploaded is from iteration 11270, not 20000? I need to double-check, but it's possible we picked the best-performing checkpoint, which is not necessarily the final one, following prior work.
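To mimic that selection, one would evaluate every checkpoint saved in --work_dir on the validation set and keep the one with the best metric. A rough sketch, with hypothetical names (the file pattern and the evaluate_valid_loss callable are assumptions, not the repo's actual API):

```python
import glob
import os
import re

def pick_best_checkpoint(work_dir, evaluate_valid_loss):
    """Evaluate every saved checkpoint and return the best one.

    `evaluate_valid_loss` is a hypothetical callable supplied by the user:
    it loads a checkpoint file and returns validation loss (lower is better).
    The glob pattern below is also an assumption about how snapshots are
    named; adjust it to whatever actually lands in work_dir.
    """
    best_path, best_loss = None, float("inf")
    for path in sorted(glob.glob(os.path.join(work_dir, "model.*.pt"))):
        loss = evaluate_valid_loss(path)
        step = re.search(r"(\d+)", os.path.basename(path))
        print(f"step {step.group(1) if step else '?'}: valid loss {loss:.4f}")
        if loss < best_loss:
            best_path, best_loss = path, loss
    return best_path

# Usage sketch:
# best = pick_best_checkpoint("./trained_models/GPT2_M/webnlgv9", my_eval_fn)
```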
Yes, same problem.