XLM-R training configuration
Hi, I was trying to train XLM-R base on the assembled training data, but it doesn't converge and gives essentially random output (24% accuracy on eng_Latn), while I get around 53% accuracy with mBERT.
I am using Hugging Face's multiple-choice training implementation (https://github.com/huggingface/transformers/blob/main/examples/pytorch/multiple-choice/run_swag.py) and have tried learning rates of 1e-5, 2e-5, and 5e-5.
Weirdly, if I use only about 1,500 examples, I get better results after 400 steps of training.
Would you mind sharing the training configuration you used for XLM-R? Or let me know if you have any idea what I am missing here.
python run_swag.py \
--model_name_or_path ${MODEL_PATH} \
--do_train \
--do_eval \
--train_file ${train_file} \
--prefix "train_combined" \
--learning_rate 2e-5 \
--num_train_epochs 3 \
--per_device_eval_batch_size=8 \
--per_device_train_batch_size=8 \
--overwrite_output_dir \
--output_dir ${output_dir} \
--max_seq_length 512 \
--cache_dir ${CACHE_DIR} \
--overwrite_cache \
--save_total_limit 5 \
--save_steps 500 \
--eval_steps 500 \
--save_strategy="steps" \
--evaluation_strategy="steps" \
--load_best_model_at_end True
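For context, here is a minimal sketch of how I understand the multiple-choice setup being trained (my own reconstruction, not copied from run_swag.py; xlm-roberta-base stands in for ${MODEL_PATH}, and the passage/candidates are made up):
# Score one passage/question against four candidate answers with an
# XLM-R multiple-choice head, built the same way the SWAG-style example
# pairs the context with each ending.
import torch
from transformers import AutoTokenizer, AutoModelForMultipleChoice

model_name = "xlm-roberta-base"  # what I pass via --model_name_or_path
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMultipleChoice.from_pretrained(model_name)

# Hypothetical example; the real data comes from the assembled training file.
context = "Passage text ... Question: Where does the event take place?"
candidates = ["Option A", "Option B", "Option C", "Option D"]

# One (context, candidate) pair per choice, so the model sees 4 sequences per example.
enc = tokenizer(
    [context] * len(candidates),
    candidates,
    truncation=True,
    max_length=512,
    padding=True,
    return_tensors="pt",
)
# The multiple-choice head expects inputs of shape (batch_size, num_choices, seq_len).
inputs = {k: v.unsqueeze(0) for k, v in enc.items()}

with torch.no_grad():
    logits = model(**inputs).logits  # shape (1, 4), one score per candidate
print("predicted choice:", logits.argmax(dim=-1).item())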
Thanks
Hi, have you solved this training problem? If so, what training configuration did you end up using? Thanks!
+1