replicate-python
`meta/meta-llama-3-70b` ignores `max_tokens`
I'm pretty sure I'm sending `max_tokens`, and:
- I get many more tokens than requested
- I also don't see this `max_tokens` when looking at my prediction in the browser
When I use exactly the same code with, e.g., `meta/llama-2-70b`, this does not happen: I really do get the requested number of tokens.
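A minimal sketch of the kind of call described here, assuming the client's `replicate.run(model, input=...)` entry point and that the model's output can be joined as an iterable of strings. The prompt, the limit of 20, and the helper names `build_input` and `run_model` are hypothetical and are not the reporter's actual code.

```python
import os

MAX_TOKENS = 20  # hypothetical limit, chosen only for illustration


def build_input(prompt, max_tokens=MAX_TOKENS):
    """Build the input payload sent to both models (hypothetical helper)."""
    return {"prompt": prompt, "max_tokens": max_tokens}


def run_model(model, prompt):
    """Run one model and return its output as a single string."""
    import replicate  # imported lazily so the module loads without the client

    # Assumption: the output is an iterable of string chunks.
    output = replicate.run(model, input=build_input(prompt))
    return "".join(output)


if __name__ == "__main__" and os.environ.get("REPLICATE_API_TOKEN"):
    for model in ("meta/meta-llama-3-70b", "meta/llama-2-70b"):
        text = run_model(model, "Once upon a time")
        # Word count is only a rough proxy for the number of generated tokens.
        print(model, len(text.split()), "whitespace-separated words")
```

Running the same payload against both models makes it easier to check whether only `meta/meta-llama-3-70b` exceeds the limit, and whether `max_tokens` appears in the prediction input shown in the browser.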