ebudmada
I also have this problem. Is this a limitation of llama.cpp? Why is this thread closed?
Hi Paul! I am the only user on my account. I started a new project and was asking Aider to create all the files and basic code, so that I can...
I am using gpt-4; I thought gpt-4 had a 32k token limit. Thank you
Great software, thank you for your effort! But I have the same issue: openai.error.RateLimitError: Rate limit reached for 10KTPM-200RPM in organization org-oDQRbTYsPHkQqj21ou6TcfMk on tokens per min. Limit: 10000 /...
> you can set the OPENAI_API_BASE: "https://openai-forward.metadl.com/v1"

How does this actually help us? Does it retry the request when the error occurs?
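For anyone trying the suggestion above: pointing the SDK at an alternative base URL is just an environment variable. Note that the forward at https://openai-forward.metadl.com/v1 is a third-party proxy; nothing here documents whether it retries on rate limits, so treat this as a configuration sketch only:

```shell
# Route SDK calls through an alternative OpenAI-compatible base URL.
export OPENAI_API_BASE="https://openai-forward.metadl.com/v1"
export OPENAI_API_KEY="sk-..."   # your own key still applies
```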
> Remember, rate limits are in place to ensure fair usage and prevent abuse. It's essential to respect these limits and design applications accordingly. We are actually paying for that...
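Until the limits are raised, the usual client-side workaround is exponential backoff. A minimal sketch, assuming the legacy `openai` Python SDK where `openai.error.RateLimitError` is the exception raised (a stand-in class is defined here so the snippet is self-contained):

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for openai.error.RateLimitError; swap in the real class."""


def with_backoff(call, retry_on=(RateLimitError,), max_retries=5, base_delay=1.0):
    """Retry `call` with exponential backoff plus jitter on rate-limit errors."""
    for attempt in range(max_retries):
        try:
            return call()
        except retry_on:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error
            # Double the delay each attempt and add jitter to avoid thundering herd.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
```

You would wrap each ChatCompletion call, e.g. `with_backoff(lambda: openai.ChatCompletion.create(...))`.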
Yes, this feature would be useful.
Has anyone managed to make the llama.cpp server work with LMQL? llama-cpp-python is full of bugs; using the llama.cpp server would solve a lot of problems. Thank you
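One way to bypass llama-cpp-python entirely is to talk to the llama.cpp server over HTTP, since it exposes an OpenAI-compatible chat endpoint. A minimal sketch, assuming a server running locally on port 8080 (the URL, port, and `build_chat_request` helper are illustrative, not LMQL's API):

```python
import json
import urllib.request


def build_chat_request(prompt, model="local"):
    """Build an OpenAI-style chat completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(prompt, base_url="http://localhost:8080"):
    """POST a chat completion to a llama.cpp server and return the reply text."""
    payload = json.dumps(build_chat_request(prompt)).encode()
    req = urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Whether LMQL can be pointed at such an endpoint is a separate question; this only shows that the server side speaks the OpenAI wire format.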