Wagtail
Have you tried normalizing your input text, e.g. with `input.capitalize()`? The sentencepiece tokenizer chunks rare words into many small pieces, especially if they are written in uppercase but are normally lowercase.
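For illustration, here is a rough sketch with the Hugging Face T5 sentencepiece tokenizer (just an assumption about your setup; any sentencepiece tokenizer should show the same effect):

```python
# Sketch: unusual casing makes sentencepiece split a word into many rare pieces.
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")

raw = "TRANSFORMERS ARE GREAT"
normalized = raw.capitalize()          # -> "Transformers are great"

print(tokenizer.tokenize(raw))         # many short, rare pieces
print(tokenizer.tokenize(normalized))  # fewer, more common pieces
```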
I am currently [researching language modeling](https://gitlab.com/Bachstelze/instructionbert).
@Leolty It is possible that the model generates multiple words if it was pretrained with longer masked spans, as in the [UL2 mixture of denoisers](https://ai.googleblog.com/2022/10/ul2-20b-open-source-unified-language.html). Sometimes the T5 models already...
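As a rough sketch (assuming a standard Hugging Face T5 checkpoint, which may not match your exact setup), a single sentinel token can already be filled with several words, because the span-corruption objective masks multi-token spans:

```python
# Sketch: one masked sentinel (<extra_id_0>) is often filled with several words.
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

inputs = tokenizer("The capital of France <extra_id_0> on the Seine.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)

# The decoded output interleaves sentinel tokens with the predicted spans.
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```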
What is the status? The logs of the checks have expired.
> If you have less than the default number of GPUs (8)

Who has a default number of 8 GPUs?
@conceptofmind Sorry, I got confused by this figure from [UL2](https://ai.googleblog.com/2022/10/ul2-20b-open-source-unified-language.html) and concluded that they switched completely to encoder-decoder models:

> Figure description: In both decoder-only and encoder-decoder setups, UL2 strikes a...
@conceptofmind Thank you for your interest and contribution! To my knowledge, there is no research showing that a decoder-only modification performs better than an encoder-decoder architecture. The...
What is your base model? Flan-T5? Is there any documentation? [GPT4All](https://github.com/nomic-ai/gpt4all) released weights and data for code instructions.
> For a generation problem, it is usually better to use GPT2 as the decoder, over BERT.

Why should this be the case, if you have enough data to train...
> > > For a generation problem, it is usually better to use GPT2 as the decoder, over BERT.
> >
> > Why should this be the...
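If this is about warm-starting an encoder-decoder model, here is a minimal sketch (assuming the Hugging Face `EncoderDecoderModel` API, which is not named in this thread) of how both decoder choices can be set up and compared on the same data:

```python
# Sketch: warm-start the same encoder with either a BERT or a GPT2 decoder,
# then fine-tune both on the generation task and compare.
from transformers import EncoderDecoderModel

# BERT encoder + BERT decoder (cross-attention layers are added and randomly initialized)
bert2bert = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"
)

# BERT encoder + GPT2 decoder
bert2gpt2 = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "gpt2"
)
```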