gemma
Maximum output length and the effect of other generation parameters
Hi, what is the maximum output_len and how can I read information related to all the parameters that I can change to get better results?
model.generate(
    USER_CHAT_TEMPLATE.format(prompt=prompt),
    device=device,
    output_len=1000,
)
@kiashann, AFAIK, depending on the model used, we can accommodate a certain number of tokens shared between the prompt and the model's generation, which forces us to split the input if the context is too large.
Gemma has a maximum context length of 8192 tokens, which roughly translates to more than 6100 words.
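To illustrate the splitting mentioned above: because the prompt and the generated output share one context window, the largest usable `output_len` is whatever the prompt has not already consumed. A minimal sketch (the 8192 limit is Gemma's documented context length; the helper function name is ours, not part of any library):

```python
GEMMA_CONTEXT_LENGTH = 8192  # total tokens shared by prompt + generation


def max_output_len(prompt_token_count: int,
                   context_length: int = GEMMA_CONTEXT_LENGTH) -> int:
    """Return the largest output_len that still fits in the context window."""
    remaining = context_length - prompt_token_count
    return max(remaining, 0)  # never negative, even if the prompt overflows


# A 2000-token prompt leaves 6192 tokens for the model's reply.
print(max_output_len(2000))  # 6192
# A prompt at or over the limit leaves no room to generate.
print(max_output_len(9000))  # 0
```

So asking for `output_len=1000` only works if the tokenized prompt stays under 7192 tokens; otherwise the prompt itself must be truncated or chunked.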
Also, if you're working with the Gemma code directly, you can explore the GemmaConfig class, which defines the model's parameters: https://huggingface.co/docs/transformers/en/model_doc/gemma
Thank you!
Hi @kiashann,
Could you please confirm whether this issue is resolved for you by the above comment? Please feel free to close the issue if it is resolved.
Thank you.
Closing this issue due to lack of recent activity. Please reopen if this is still a valid request. Thank you!
Does the Gemma 3 model have issues similar to https://github.com/ollama/ollama/issues/9871? I can't seem to use more than 8192 tokens. That seems to be the default for Gemma, but it shouldn't be the case for Gemma 3, which supports a 32K context on the 1B model and 128K on the larger ones.
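If this is the same behavior as the linked Ollama issue, the cap likely comes from Ollama's own default context size rather than from the model weights: Ollama does not automatically use a model's full advertised window, so Gemma 3's 32K/128K context has to be requested explicitly. A hedged sketch of raising it via the `options.num_ctx` field of a request to Ollama's `/api/generate` endpoint (the model tag and token count here are illustrative, not verified against your install):

```json
{
  "model": "gemma3:1b",
  "prompt": "Summarize this long document ...",
  "options": {
    "num_ctx": 32768
  }
}
```

The same setting can be baked into a custom model with a `PARAMETER num_ctx 32768` line in a Modelfile, so every request to that model gets the larger window.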