Increase max input tokens for puter-chat-completion from 4k to 128k

Open recursionbane opened this issue 1 year ago • 2 comments

Hi, Puter currently hardcodes the max input tokens for any chat completion request to 4k, but the default model (gpt-4o-mini) supports 128k input tokens.

Can we increase the max token limit to 128k to match the token limit for gpt-4o-mini?

Thanks in advance!

recursionbane, Sep 23 '24 02:09

Yes, we should be able to do this. I need to do the math for costs first. I'll update this issue again soon.

jelveh, Sep 29 '24 07:09
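
For anyone following along, the cost math mentioned above can be roughed out as below. This is only a sketch: the per-token price is a placeholder assumption (not taken from this thread) and should be checked against the provider's current pricing, and the function and constant names are made up for illustration. It compares the worst-case input cost of a single request that fills the whole cap, at 4k versus 128k.

```python
# Rough worst-case input cost per request at two input-token caps.
# ASSUMPTION: this price is a placeholder; check the provider's current pricing.
ASSUMED_USD_PER_MILLION_INPUT_TOKENS = 0.15


def max_input_cost_usd(
    max_input_tokens: int,
    usd_per_million: float = ASSUMED_USD_PER_MILLION_INPUT_TOKENS,
) -> float:
    """Cost of one request whose input fills the entire token cap."""
    return max_input_tokens * usd_per_million / 1_000_000


current_cap = 4_000      # "4k" as described in the issue
proposed_cap = 128_000   # "128k" as requested in the issue

cost_now = max_input_cost_usd(current_cap)
cost_proposed = max_input_cost_usd(proposed_cap)

print(f"4k cap:   ${cost_now:.4f} per maxed-out request")
print(f"128k cap: ${cost_proposed:.4f} per maxed-out request")
print(f"increase: {proposed_cap / current_cap:.0f}x")
```

Whatever the real price is, the worst case scales linearly with the cap, so a 4k to 128k change multiplies the maximum input cost per request by 32.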

Thanks, looking forward to the new use-cases this will enable.

recursionbane, Sep 29 '24 19:09

> Yes, we should be able to do this. I need to do the math for costs. I'll update this again soon

Hi @jelveh, is the cost structure for this change palatable?

recursionbane, Oct 12 '24 04:10

Token-based usage tracking is likely a prerequisite to this. I think it's better for everyone to have it work that way.

KernelDeimos, Nov 06 '24 20:11
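
To make the idea of token-based usage tracking concrete, here is a minimal sketch of what it could look like. It is not Puter's implementation: the class name, method names, and the per-user budget are all hypothetical. The point is that usage is metered in tokens actually consumed rather than in requests, so a large input cap does not let a single request bypass the limit.

```python
from collections import defaultdict


class TokenUsageTracker:
    """Hypothetical per-user token meter (illustrative only, not Puter's API)."""

    def __init__(self, budget_per_user: int) -> None:
        self.budget_per_user = budget_per_user
        self._used: dict[str, int] = defaultdict(int)

    def remaining(self, user_id: str) -> int:
        """Tokens the user may still spend; never negative."""
        return max(self.budget_per_user - self._used[user_id], 0)

    def can_afford(self, user_id: str, estimated_tokens: int) -> bool:
        """Check a request's estimated token count before sending it."""
        return estimated_tokens <= self.remaining(user_id)

    def record(self, user_id: str, input_tokens: int, output_tokens: int) -> None:
        """Add the token counts reported for a completed request."""
        self._used[user_id] += input_tokens + output_tokens


# Example: an assumed budget of 200k tokens per user.
tracker = TokenUsageTracker(budget_per_user=200_000)
tracker.record("alice", input_tokens=128_000, output_tokens=1_000)
print(tracker.remaining("alice"))            # tokens left after one large request
print(tracker.can_afford("alice", 128_000))  # a second maxed-out request
```

With a meter like this in place, the input cap and the cost exposure become separate knobs: the cap can follow the model's limit while the budget bounds spend per user.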