Increase max input tokens for puter-chat-completion from 4k to 128k
Hi, Puter currently hardcodes the max input tokens for any chat completion request to 4k, but the default model (gpt-4o-mini) supports 128k input tokens.
Can we increase the max token limit to 128k to match the token limit for gpt-4o-mini?
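To make the request concrete, here is a minimal sketch of what a per-model limit could look like in place of a single hardcoded cap. This is not Puter's actual code: `MODEL_MAX_INPUT_TOKENS`, `DEFAULT_MAX_INPUT_TOKENS`, `max_input_tokens`, and `check_input` are hypothetical names, and the only figures used are the 4k and 128k limits mentioned above.

```python
# Hypothetical per-model input limits (assumption: not taken from Puter's codebase).
# Only the gpt-4o-mini figure comes from this issue.
MODEL_MAX_INPUT_TOKENS = {
    "gpt-4o-mini": 128_000,
}

# The current hardcoded cap described in this issue, kept as a fallback.
DEFAULT_MAX_INPUT_TOKENS = 4_000


def max_input_tokens(model: str) -> int:
    """Return the input-token limit for a model, or the fallback if unknown."""
    return MODEL_MAX_INPUT_TOKENS.get(model, DEFAULT_MAX_INPUT_TOKENS)


def check_input(model: str, input_tokens: int) -> None:
    """Raise ValueError if a request exceeds the limit for its model."""
    limit = max_input_tokens(model)
    if input_tokens > limit:
        raise ValueError(
            f"{input_tokens} input tokens exceeds the {limit}-token limit for {model}"
        )
```

With something like this, a 50k-token request to gpt-4o-mini would pass, while models without an entry would keep the existing 4k cap.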
Thanks in advance!
Yes, we should be able to do this. I need to do the math for costs first. I'll update this again soon.
Thanks, looking forward to the new use-cases this will enable.
Hi @jelveh, is the cost structure for this change palatable?
Token-based usage tracking is likely a prerequisite to this. I think it's better for everyone to have it work that way.