Stephan Sturges
Awesome, thanks!
Two things that would be great to add to the parallel embeddings calculator: exponential backoff for rate limits, plus token-length calculation and an automatic cut-off at the first 8000 tokens...
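For the backoff half of that, here is a minimal sketch of retrying an embeddings call with exponential backoff, assuming the pre-1.0 `openai` Python client (`openai.Embedding.create` / `openai.error.RateLimitError`); the retry count and delays are placeholder values:

```python
import random
import time

import openai


def embed_with_backoff(text, model="text-embedding-ada-002", max_retries=6):
    """Call the embeddings endpoint, retrying on rate limits with exponential backoff."""
    delay = 1.0
    for attempt in range(max_retries):
        try:
            return openai.Embedding.create(model=model, input=text)
        except openai.error.RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the error
            # Sleep, then double the delay; the random jitter spreads out
            # retries from parallel workers so they don't all hit the API at once.
            time.sleep(delay + random.uniform(0, 1))
            delay *= 2
```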
> I mean this :)
> https://github.com/openai/openai-cookbook/blob/main/examples/api_request_parallel_processor.py
>
> It works great, but it would be great to limit the request to the max length of tokens that the model can...
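For the token-limit half, a small sketch using `tiktoken` to truncate input to the first 8000 tokens, as suggested above (the 8000 figure comes from the comment, and `truncate_to_token_limit` is a hypothetical helper, not part of the cookbook script):

```python
import tiktoken

MAX_TOKENS = 8000  # cut-off suggested above, just under the embedding model's context limit


def truncate_to_token_limit(text, model="text-embedding-ada-002", max_tokens=MAX_TOKENS):
    """Keep only the first max_tokens tokens of text, decoded back to a string."""
    encoding = tiktoken.encoding_for_model(model)
    tokens = encoding.encode(text)
    if len(tokens) <= max_tokens:
        return text
    return encoding.decode(tokens[:max_tokens])
```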
It would be cool if there were a way to share a bucket of JSONL files with OpenAI, to get the embeddings computed some way other than through the API....
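Until something like that exists, the parallel processor script already reads a JSONL file of request bodies, so building such a "bucket" locally might look like this (the file name and texts are hypothetical):

```python
import json

# Hypothetical corpus to embed.
texts = ["first document", "second document"]

# One request body per line, in the shape the embeddings endpoint expects.
with open("embedding_requests.jsonl", "w") as f:
    for text in texts:
        f.write(json.dumps({"model": "text-embedding-ada-002", "input": text}) + "\n")
```

The resulting file can then be passed to `api_request_parallel_processor.py` via its `--requests_filepath` argument.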
Yes, let me add that to the main Readme.
Starting at line 45 in the main.py file: [screenshot missing; image upload failed]
> I simply add `torch.cuda.empty_cache()` and `gc.collect()` whenever the program calls `del` to delete something from the GPU, forcing the GPU to release the unused memory ASAP, ...
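As a sketch of that pattern (the tensor name and shape are made up; note that `torch.cuda.empty_cache()` only releases cached blocks that no live tensor still uses):

```python
import gc

import torch

# Hypothetical large tensor living on the GPU.
activations = torch.randn(4096, 4096, device="cuda")

# ... use activations ...

# The pattern quoted above: drop the last Python reference, collect any
# lingering reference cycles, then ask PyTorch's caching allocator to
# hand unused blocks back to the CUDA driver.
del activations
gc.collect()
torch.cuda.empty_cache()
```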