Rate limit

Open metallicusdev opened this issue 2 years ago • 8 comments

Running into rate limit

Rate limit reached for default-global-with-image-limits in organization XXX on requests per min. Limit: 60.000000 / min. Current: 70.000000 / min. Contact [email protected] if you continue to have issues.

metallicusdev avatar Feb 14 '23 05:02 metallicusdev

What are you using in promptable? Can you share your usage?

cfortuner avatar Feb 14 '23 14:02 cfortuner

import { OpenAI, FileLoader, CharacterTextSplitter, Embeddings } from "promptable";

// Create a model provider!
const provider = new OpenAI(process.env.API_KEY);

// Load documents
const loader = new FileLoader('../output.md');
let docs = await loader.load();

// Split documents into chunks
const splitter = new CharacterTextSplitter("\n");
docs = splitter.splitDocuments(docs, {
  chunk: true,
  chunkSize: 1000, // tokens :)
});

// Create embeddings
const embeddings = new Embeddings(
    "proton-ts-embeddings",
    provider,
    docs,
    { cacheDir: "." }
);
await embeddings.index();

It should be able to handle API rate limits and retry.

metallicusdev avatar Feb 14 '23 14:02 metallicusdev
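
The retry behavior requested above could look roughly like the following. This is a minimal sketch, not promptable's implementation: `withRetry`, `backoffDelayMs`, `isRateLimitError`, and `RetryOptions` are hypothetical names, and it assumes the failing call throws an error object carrying an HTTP `status` field.

```typescript
// Hypothetical helper types and names; none of this is promptable's API.
type RetryOptions = {
  maxRetries: number; // how many retries are allowed after the first failure
  baseDelayMs: number; // delay before the first retry
  maxDelayMs: number; // upper bound on any single delay
};

// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt: number, opts: RetryOptions): number {
  return Math.min(opts.maxDelayMs, opts.baseDelayMs * 2 ** attempt);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Assumption: rate-limit failures surface as thrown objects with status 429.
function isRateLimitError(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { status?: number }).status === 429
  );
}

// Runs `fn`, retrying only on rate-limit errors, with exponential backoff.
async function withRetry<T>(
  fn: () => Promise<T>,
  opts: RetryOptions
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Give up on non-rate-limit errors or once the retry budget is spent.
      if (!isRateLimitError(err) || attempt >= opts.maxRetries) throw err;
      await sleep(backoffDelayMs(attempt, opts));
    }
  }
}
```

With the snippet from the report, the indexing call could then be wrapped as `await withRetry(() => embeddings.index(), { maxRetries: 5, baseDelayMs: 1000, maxDelayMs: 30000 })`, provided the library surfaces the 429 status on the thrown error.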

What are your thoughts on using something like this?

https://github.com/cfortuner/promptable/pull/12

cfortuner avatar Feb 14 '23 18:02 cfortuner

I've just discovered this repo and it is very promising. It would be so good to have an NPM alternative to GPT Index. Thanks for the great work!

My humble view: #12 looks great. But in embedMany, is there a reason why you're not sending the documents to the embeddings endpoint as an array input (string[]) and instead sending them in separate requests as single input (string)?

ymansurozer avatar Feb 14 '23 18:02 ymansurozer
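
The array-input suggestion above can be sketched as follows. This is an illustration, not the code from #12: `embedManyBatched`, `chunk`, and the `EmbedBatch` type are hypothetical, and `embedBatch` stands in for whatever function sends one request with a `string[]` input and returns one vector per text.

```typescript
// Hypothetical signature: one request that embeds many texts at once.
type EmbedBatch = (texts: string[]) => Promise<number[][]>;

// Splits `items` into consecutive groups of at most `size` elements.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error("size must be at least 1");
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Sends one request per batch instead of one request per document.
async function embedManyBatched(
  texts: string[],
  embedBatch: EmbedBatch,
  batchSize: number
): Promise<number[][]> {
  const vectors: number[][] = [];
  for (const batch of chunk(texts, batchSize)) {
    // Assumption: vectors come back in the same order as the inputs.
    vectors.push(...(await embedBatch(batch)));
  }
  return vectors;
}
```

With 70 documents and a batch size of 20, this issues 4 requests rather than 70, which is what keeps the request count under a per-minute limit.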

Whoops! Yeah, I'll update that @ymansurozer

Thanks for pointing that out.

Also, I'd love your input on the API you think we should make for indexing. Want to DM me in Discord? https://discord.gg/PGVadrjm

cfortuner avatar Feb 14 '23 19:02 cfortuner

Happy to help! Now we need to somehow handle embedding requests that exceed 250,000 tokens in total to overcome rate limits. :) I'll DM on Discord now. @cfortuner

ymansurozer avatar Feb 14 '23 21:02 ymansurozer
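
One way to keep each request under a total token budget, such as the 250,000-token figure mentioned above, is to pack documents greedily into batches. The sketch below is only an illustration: `batchByTokenBudget` and `estimateTokens` are hypothetical names, and the 4-characters-per-token estimate is a rough assumption that a real tokenizer should replace.

```typescript
// Rough heuristic (assumption): about 4 characters per token.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Greedily packs texts, in order, into batches whose estimated token
// total stays within `maxTokens`. A single text larger than the budget
// still gets a batch of its own rather than being dropped.
function batchByTokenBudget(
  texts: string[],
  maxTokens: number,
  countTokens: (text: string) => number = estimateTokens
): string[][] {
  const batches: string[][] = [];
  let current: string[] = [];
  let used = 0;
  for (const text of texts) {
    const cost = countTokens(text);
    // Close the current batch if adding this text would exceed the budget.
    if (current.length > 0 && used + cost > maxTokens) {
      batches.push(current);
      current = [];
      used = 0;
    }
    current.push(text);
    used += cost;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}
```

Each resulting batch can then be sent as one array-input request, so the number of requests depends on total token volume rather than on document count.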

@cfortuner if OpenAI returns a Retry-After header with their 429 responses, we should use that for the rate limiting.

hanrelan avatar Feb 14 '23 21:02 hanrelan
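
If the header is present, it could be turned into a wait time along these lines. This is a sketch under assumptions: `retryAfterMs` is a hypothetical helper, and the thread does not confirm that the API actually sends the header. Per the HTTP specification, the value is either a number of seconds or an HTTP date.

```typescript
// Converts a Retry-After header value into milliseconds to wait.
// Returns null when the header is missing or cannot be interpreted,
// so the caller can fall back to its own backoff schedule.
function retryAfterMs(
  headerValue: string | null,
  nowMs: number = Date.now()
): number | null {
  if (headerValue === null) return null;
  const trimmed = headerValue.trim();

  // Form 1: delay in whole seconds, e.g. "20".
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;

  // Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return null;

  // A date in the past means no further waiting is needed.
  return Math.max(0, dateMs - nowMs);
}
```

With a `fetch` response this would be called as `retryAfterMs(response.headers.get("retry-after"))`, waiting for the returned duration when it is not null.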

Hey, just a quick update.

We're still working on the right implementation here, but it's coming soon!

cfortuner avatar Feb 16 '23 02:02 cfortuner