XLNet-Large has very unstable performance. Do you have the same problem?

Open yucc2018 opened this issue 5 years ago • 3 comments

I am working on a classification problem with about 200 labels and 20,000 examples.

When I fine-tune the XLNet-Base model with 5-fold cross-validation, the accuracies across the 5 folds are 80.05%, 78.79%, 78.89%, 79.17%, and 78.48%, which looks normal. In this experiment, train_steps = 7500, warmup_steps = 750, learning_rate = 5e-5, and train_batch_size = 16.
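To make the schedule implied by these settings concrete, here is a minimal sketch, assuming linear warmup to the peak learning rate followed by linear decay to zero. That shape is an assumption for illustration only; the thread does not state which decay the fine-tuning script applies, and `lr_at` is a hypothetical helper, not part of XLNet.

```python
def lr_at(step, train_steps=7500, warmup_steps=750, peak_lr=5e-5):
    """Learning rate at a given step.

    Assumed shape (not confirmed by the thread): linear warmup from 0 to
    peak_lr over warmup_steps, then linear decay to 0 at train_steps.
    """
    if step < warmup_steps:
        # Warmup phase: scale the peak rate by the fraction of warmup completed.
        return peak_lr * step / warmup_steps
    # Decay phase: scale the peak rate by the fraction of post-warmup steps left.
    remaining = max(train_steps - step, 0)
    return peak_lr * remaining / (train_steps - warmup_steps)


# Print the rate at a few points of the 7500-step run described above.
for s in (0, 375, 750, 4125, 7500):
    print(s, lr_at(s))
```

Under this assumption the model sees the full 5e-5 rate at step 750, after only 750 × 16 = 12,000 training examples.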

But when I run the same experiment with the same parameters (train_steps = 7500, warmup_steps = 750, learning_rate = 5e-5, train_batch_size = 16) and XLNet-Large, the 5-fold cross-validation accuracies are 6.77%, 6.52%, 6.58%, 79.70%, and 6.71%.

I then suspected that train_steps was too small, so I ran XLNet-Large again with train_steps raised from 7500 to 15,000 (train_steps = 15000, warmup_steps = 1500, learning_rate = 5e-5, train_batch_size = 16) and got 79.86%, 4.52%, 74.15%, 77.66%, and 78.85%.
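The spread in that last run can be quantified with a minimal sketch like the one below. The fold accuracies are the ones reported above; the `summarize` helper and its 10% cutoff for calling a fold "collapsed" are assumptions chosen for illustration, not anything from the XLNet code.

```python
from statistics import mean, stdev

# Fold accuracies (%) reported above for the longer XLNet-Large run.
large_long_run = [79.86, 4.52, 74.15, 77.66, 78.85]


def summarize(accs, collapse_threshold=10.0):
    """Return (mean, sample std, indices of folds below the threshold).

    collapse_threshold is an assumed cutoff in percent, used only to flag
    folds where fine-tuning apparently failed to learn.
    """
    collapsed = [i for i, a in enumerate(accs) if a < collapse_threshold]
    return mean(accs), stdev(accs), collapsed


m, s, collapsed = summarize(large_long_run)
print(f"mean={m:.2f}%  std={s:.2f}  collapsed folds={collapsed}")
```

Reporting the collapsed folds separately from the mean keeps one failed run from hiding how well the other folds did.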

So, have you seen the same unstable performance with XLNet-Large, and what steps should I take to get better and more stable results with it?

Thanks for reading and for your help!

yucc2018 • Jul 21 '19 07:07

I don't see unstable results, but I do see significant underperformance on sentence-pair classification, where I am using a pretrained Chinese XLNet model. It is around 10 points lower than other standard pretrained models.

brightmart • Sep 13 '19 12:09

What did you do afterwards? Did you find anything?

brightmart • Sep 13 '19 12:09

@brightmart How did you get a Chinese pretrained XLNet model?

LindgeW • Jan 01 '20 12:01