
effect of max_seq_length on performance

Open Ashbajawed opened this issue 5 years ago • 0 comments

I was getting a memory allocation error while fine-tuning the mega model. I then reduced the batch size to 1, and training is now running. I also tried reducing max_seq_length to 512 with batch_size set to 4, and that worked as well.

My question is: which has a greater effect on performance, reducing the batch size or reducing max_seq_length?

Also, can I set max_seq_length to a value that is not a power of 2, such as some value between 512 and 1024?
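
For context, here is a rough back-of-the-envelope sketch of how activation memory might scale with the two settings. It assumes a standard transformer where per-token activations grow with batch_size * seq_len * hidden_size and attention score matrices grow with batch_size * num_heads * seq_len^2. The function name and the default hidden_size and num_heads are hypothetical illustration values, not the mega model's actual configuration, and the result is a unitless relative number, not bytes.

```python
def relative_activation_cost(batch_size, seq_len, hidden_size=1024, num_heads=16):
    """Rough, unitless estimate of per-layer activation size.

    Assumptions (not taken from the Grover code):
      - per-token activations scale with batch_size * seq_len * hidden_size
      - attention scores scale with batch_size * num_heads * seq_len ** 2
    hidden_size and num_heads are placeholder values for illustration only.
    """
    per_token = batch_size * seq_len * hidden_size
    attention = batch_size * num_heads * seq_len * seq_len
    return per_token + attention


# The two settings described above: batch 1 at 1024 tokens vs. batch 4 at 512.
cost_a = relative_activation_cost(batch_size=1, seq_len=1024)
cost_b = relative_activation_cost(batch_size=4, seq_len=512)
print(cost_a, cost_b, cost_b / cost_a)
```

Under these assumptions the cost is linear in batch size but grows faster than linearly in sequence length, which would be consistent with batch_size 4 fitting in memory once max_seq_length is halved. Actual memory use depends on the implementation, optimizer state, and hardware, so this is only a way to reason about the trade-off.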

Ashbajawed, Oct 19 '20 06:10