
Command to replicate transfer results

Open luffycodes opened this issue 3 years ago • 0 comments

CUDA_VISIBLE_DEVICES=6 python train.py \
  --model_name_or_path bert-base-uncased \
  --generator_name distilbert-base-uncased \
  --train_file data/nli_for_simcse.csv \
  --num_train_epochs 2 \
  --per_device_train_batch_size 64 \
  --learning_rate 2e-6 \
  --max_seq_length 32 \
  --evaluation_strategy steps \
  --metric_for_best_model stsb_spearman \
  --load_best_model_at_end \
  --eval_steps 125 \
  --pooler_type cls \
  --overwrite_output_dir \
  --logging_first_step \
  --logging_dir trained \
  --temp 0.05 \
  --do_train \
  --do_eval \
  --batchnorm \
  --lambda_weight 0.05 \
  --fp16 \
  --masking_ratio 0.15 \
  --output_dir trained_orig

I cannot replicate the transfer task results after training the model with the parameters above. Note that the lambda, learning rate, and batch size values are taken from the appendix of the paper.

Results I got:
***** Eval results *****
epoch = 2.0
eval_CR = 88.03
eval_MPQA = 88.01
eval_MR = 80.52
eval_MRPC = 73.97
eval_SST2 = 85.32
eval_SUBJ = 93.48
eval_TREC = 77.07
eval_avg_sts = 0.8235647840844718
eval_avg_transfer = 83.77142857142859
eval_sickr_spearman = 0.8026336149884777
eval_stsb_spearman = 0.8444959531804657
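
As a sanity check on the log above, the two aggregate numbers appear to be plain unweighted means: eval_avg_transfer over the seven transfer tasks, and eval_avg_sts over the STS-B and SICK-R Spearman scores. The short sketch below recomputes them from the reported per-task values. The dictionary names are only for illustration, and the assumption that the evaluation script averages this way is inferred from the numbers, not taken from the repository code.

```python
import math
from statistics import mean

# Per-task scores copied from the eval log above.
transfer_scores = {
    "CR": 88.03,
    "MPQA": 88.01,
    "MR": 80.52,
    "MRPC": 73.97,
    "SST2": 85.32,
    "SUBJ": 93.48,
    "TREC": 77.07,
}
sts_scores = {
    "sickr_spearman": 0.8026336149884777,
    "stsb_spearman": 0.8444959531804657,
}

# Assumption: both aggregates are unweighted means of the per-task scores.
avg_transfer = mean(transfer_scores.values())
avg_sts = mean(sts_scores.values())

# Compare against the aggregates reported in the log.
print(math.isclose(avg_transfer, 83.77142857142859, rel_tol=1e-9))
print(math.isclose(avg_sts, 0.8235647840844718, rel_tol=1e-9))
```

Both checks agree with the logged aggregates, so the gap to the paper's numbers would come from the per-task scores themselves, not from how they are averaged.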

Could you please help with the right parameters to train the model from scratch in the supervised setting?

luffycodes, Sep 08 '22 21:09