Seb Duerr

Results: 2 comments by Seb Duerr

Thank you so much for your feedback! I added these:

- Create a Cerebras model subclass of the OpenAI Chat model.
- Set up Cerebras-specific code paths, similar to...

Cerebras handles rate limiting differently from most providers. It estimates token usage upfront using the `max_completion_tokens` value, so if a client always sends 32k, each request is counted as if...
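To make the effect of upfront estimation concrete, here is a rough sketch of the arithmetic. It is not Cerebras's actual accounting: the per-minute quota, the prompt size, the helper name `requests_per_minute`, and the assumption that the reserved amount is prompt tokens plus `max_completion_tokens` are all hypothetical and only there for illustration.

```python
def requests_per_minute(tokens_per_minute: int,
                        prompt_tokens: int,
                        max_completion_tokens: int) -> int:
    """Number of requests that fit in a per-minute token budget when each
    request is charged upfront for its worst case (assumed here to be
    prompt_tokens + max_completion_tokens), not for what it actually uses."""
    reserved = prompt_tokens + max_completion_tokens
    return tokens_per_minute // reserved


QUOTA = 60_000  # hypothetical tokens-per-minute limit, not a real Cerebras value
PROMPT = 500    # hypothetical prompt size in tokens

# Always sending a 32k cap leaves room for very few requests per minute.
print(requests_per_minute(QUOTA, PROMPT, 32_000))  # 1

# A cap closer to the expected output length leaves room for many more.
print(requests_per_minute(QUOTA, PROMPT, 1_000))   # 40
```

Under these assumptions, lowering `max_completion_tokens` to something close to the expected response length is what frees up budget, even if the actual completions were short in both cases.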