composer
InContextLearning*Dataset: default padding side hardcoded?
Hi, I have a question about the code here: https://github.com/mosaicml/composer/blob/a7cad7c221ce8ad9697bde50db0b3f37f8b8025e/composer/datasets/in_context_learning_evaluation.py#L655
Why is right padding assumed (for InContextLearningMultipleChoiceTaskDataset, but also for some of the other dataset classes)?
- Shouldn't the padding_side be derived from the tokenizer?
- Assuming right padding breaks some models (Mistral is unusable).
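
To illustrate what I mean, here is a minimal sketch of padding that follows the tokenizer's setting instead of a fixed side. It is not Composer's code: `pad_to_length` and `StubTokenizer` are hypothetical names I made up for the example, and the only thing assumed about a real tokenizer is that it exposes `padding_side` and `pad_token_id`, as Hugging Face tokenizers do.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StubTokenizer:
    """Hypothetical stand-in exposing only the two attributes used below."""
    pad_token_id: int = 0
    padding_side: str = "left"


def pad_to_length(tokens: List[int], max_len: int, tokenizer) -> List[int]:
    """Pad `tokens` to `max_len` on the side the tokenizer asks for.

    Falls back to right padding if the tokenizer has no `padding_side`.
    Sequences already at or above `max_len` are returned unchanged.
    """
    side = getattr(tokenizer, "padding_side", "right")
    if side not in ("left", "right"):
        raise ValueError(f"Unsupported padding_side: {side!r}")
    padding = [tokenizer.pad_token_id] * max(0, max_len - len(tokens))
    # Left padding puts the pad tokens before the content, right padding after.
    return padding + list(tokens) if side == "left" else list(tokens) + padding


left = pad_to_length([5, 6, 7], 5, StubTokenizer(padding_side="left"))
right = pad_to_length([5, 6, 7], 5, StubTokenizer(padding_side="right"))
print(left)   # [0, 0, 5, 6, 7]
print(right)  # [5, 6, 7, 0, 0]
```

Any index bookkeeping that depends on where the real tokens sit (for example, continuation spans) would presumably also need to shift when padding moves to the left, so this only covers the padding step itself.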
Thanks for the information.