Results of TREC dataset on RoBERTa-large (K=512) with MeZO (LoRA)

Open Yanjun-Zhao opened this issue 1 year ago • 8 comments

I used the grid search below but couldn't reproduce the result from the paper. (I have updated the code for WD and successfully reproduced the result on SST2.)

TASK=trec K=512 SEED=42 BS=64 WD=0.1 LR=1e-4/5e-5/1e-5 EPS=1e-3 MODEL=roberta-large EXTRA_TAG=lora bash mezo.sh --apply_lora --lora_r 8 --lora_alpha 16
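Read as a grid, the LR=1e-4/5e-5/1e-5 shorthand in the command above stands for three separate runs, one per learning rate. A minimal Python sketch that only builds and prints the three command lines (it does not launch anything); `build_command` is a hypothetical helper, and every variable and flag is copied from the command above:

```python
import shlex

# The three learning rates from the grid above.
LEARNING_RATES = ["1e-4", "5e-5", "1e-5"]


def build_command(lr, seed=42):
    """Return one mezo.sh command line for a single learning rate."""
    env = {
        "TASK": "trec",
        "K": "512",
        "SEED": str(seed),
        "BS": "64",
        "WD": "0.1",
        "LR": lr,
        "EPS": "1e-3",
        "MODEL": "roberta-large",
        "EXTRA_TAG": "lora",
    }
    # Environment variables go in front of the script, as in the original command.
    prefix = " ".join(f"{key}={shlex.quote(value)}" for key, value in env.items())
    return f"{prefix} bash mezo.sh --apply_lora --lora_r 8 --lora_alpha 16"


commands = [build_command(lr) for lr in LEARNING_RATES]
for command in commands:
    print(command)
```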

Here are the results I got; the paper reports an accuracy of 95.

LR | Accuracy
--- | ---
1e-4 | 57.4
5e-5 | 60
1e-5 | 58.2

Yanjun-Zhao avatar Dec 20 '23 03:12 Yanjun-Zhao

Hi,

There was a small bug in incorporating weight decay in the code and now it is fixed. Please try again!

gaotianyu1350 avatar Dec 27 '23 13:12 gaotianyu1350

Thanks for your reply! I have already used the updated code, and it still fails on the TREC dataset.

Yanjun-Zhao avatar Dec 27 '23 15:12 Yanjun-Zhao

Which commit are you using? Also, by "fail", do you mean that you could not reproduce the result, or that there was a runtime error?

gaotianyu1350 avatar Dec 27 '23 15:12 gaotianyu1350

I am using the version of the code with the update param.data = param.data - self.args.learning_rate * (projected_grad * z + self.args.weight_decay * param.data). I can't reproduce the result with TASK=trec K=512 SEED=42 BS=64 WD=0.1 LR=1e-4/5e-5/1e-5 EPS=1e-3 MODEL=roberta-large EXTRA_TAG=lora bash mezo.sh --apply_lora --lora_r 8 --lora_alpha 16. There is no runtime error.
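For readers following along, here is a toy, pure-Python sketch of a single zeroth-order step that ends in the same update form as the line quoted above. It is not the repository's trainer: `mezo_step` and `toy_loss` are hypothetical names, the parameters are a plain list of floats, and the loss is a made-up quadratic used only to exercise the arithmetic.

```python
import random


def mezo_step(params, loss_fn, lr, eps, weight_decay, seed):
    """One SPSA-style zeroth-order step with weight decay folded into the update."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in params]  # random perturbation direction

    # Two forward passes, at params + eps*z and params - eps*z.
    loss_plus = loss_fn([p + eps * zi for p, zi in zip(params, z)])
    loss_minus = loss_fn([p - eps * zi for p, zi in zip(params, z)])

    # Scalar finite-difference estimate of the directional derivative along z.
    projected_grad = (loss_plus - loss_minus) / (2.0 * eps)

    # Same form as the quoted update:
    # param <- param - lr * (projected_grad * z + weight_decay * param)
    return [
        p - lr * (projected_grad * zi + weight_decay * p)
        for p, zi in zip(params, z)
    ]


def toy_loss(params):
    # Hypothetical quadratic loss, minimized when every coordinate equals 1.0.
    return sum((p - 1.0) ** 2 for p in params)


params = [0.0, 0.0, 0.0]
for step in range(2000):
    params = mezo_step(params, toy_loss, lr=1e-2, eps=1e-3,
                       weight_decay=0.0, seed=step)
final_loss = toy_loss(params)
```

With `weight_decay` set to a positive value, each step additionally shrinks every parameter by a factor of `1 - lr * weight_decay`, independent of the loss estimate.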

Yanjun-Zhao avatar Dec 27 '23 16:12 Yanjun-Zhao

Hi,

Can you post the results you get with this experiment? Also, note that our reported results are averaged over five seeds, following this paper's setting. The five seeds are 13, 21, 42, 87, and 100.
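When comparing against the averaged numbers, a small helper along these lines may be useful. It is only a sketch: `summarize` is a hypothetical function, and it assumes you have already collected one accuracy value per seed from your own runs.

```python
from statistics import mean, stdev

SEEDS = [13, 21, 42, 87, 100]  # the five seeds listed above


def summarize(accuracy_by_seed):
    """Return (mean, sample standard deviation) of accuracy over the five seeds."""
    missing = [seed for seed in SEEDS if seed not in accuracy_by_seed]
    if missing:
        # Refuse to average over a partial set of seeds.
        raise ValueError(f"missing results for seeds: {missing}")
    values = [accuracy_by_seed[seed] for seed in SEEDS]
    return mean(values), stdev(values)
```

A single-seed run (for example SEED=42 only) can differ noticeably from a five-seed average, so reporting both the mean and the spread makes the comparison clearer.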

gaotianyu1350 avatar Dec 28 '23 13:12 gaotianyu1350

@Yanjun-Zhao We were not able to reproduce the RoBERTa-large results either. You mentioned that you updated the code for WD and successfully reproduced the result on SST2. Do you mean that you used the code after this fix, https://github.com/princeton-nlp/MeZO/commit/552cb1b710767f9a6e1dc8f9645d7640376f9941? Would you mind sharing the command and parameters you used?

Thanks a lot!

fjxmlzn avatar Dec 29 '23 19:12 fjxmlzn

@gaotianyu1350 Hi, we have the same issue. Would you mind sharing the code and configurations for reproduction?

hxixixh avatar Apr 03 '24 21:04 hxixixh

@hxixixh can you post the configuration you used and the results you got?

gaotianyu1350 avatar Apr 09 '24 11:04 gaotianyu1350