
Batch size in words

Open kocmitom opened this issue 3 years ago • 0 comments

🚀 Feature

Being able to specify the batch size as a number of tokens, rather than a number of sentences, could allow larger batches.

Motivation

When training on data with widely varying sentence lengths, increasing the batch size can break training, especially when one of the languages is tokenized into many tokens per sentence. Specifying the batch size in tokens could fix this issue and allow better use of the GPU.
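
To illustrate the request, here is a minimal sketch of token-based batching in plain Python. It is not COMET's implementation or API: the function name `token_budget_batches` and the `max_tokens` parameter are hypothetical, and it assumes the token length of each sample is already known. Samples are sorted by length and grouped so that the padded size of each batch (longest sample times number of samples) stays within the budget.

```python
from typing import Iterator, List, Sequence


def token_budget_batches(
    lengths: Sequence[int], max_tokens: int
) -> Iterator[List[int]]:
    """Yield lists of sample indices whose padded size stays within max_tokens.

    Hypothetical helper for illustration only; not part of COMET.
    """
    # Sort by length so each batch holds similar-length samples
    # and little of the budget is spent on padding.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    batch: List[int] = []
    longest = 0
    for idx in order:
        new_longest = max(longest, lengths[idx])
        # Padded cost: every sample is padded to the longest one in the batch.
        if batch and new_longest * (len(batch) + 1) > max_tokens:
            yield batch
            batch, new_longest = [], lengths[idx]
        batch.append(idx)
        longest = new_longest
    if batch:
        yield batch


# Token counts per sample (made-up numbers for the example).
lengths = [5, 50, 7, 6, 48, 100]
batches = list(token_budget_batches(lengths, max_tokens=100))
print(batches)
```

With this scheme, short sentences are packed into larger batches while long ones end up in smaller batches, so memory use per batch stays roughly constant. A sample longer than `max_tokens` still gets a batch of its own in this sketch. In a PyTorch setup, a list of index batches like this could be passed to `DataLoader` through its `batch_sampler` argument, though whether that fits COMET's training loop is an open question.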

kocmitom, Oct 18 '22 14:10