open-text-embeddings
Run On Multi-GPU
Is it possible? My run failed with a CUDA out-of-memory error.
You can set the device_map when loading the embedding model with transformers.
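For example, a minimal sketch (not the project's actual loading code) of spreading a model across the available GPUs with `device_map="auto"`; the model name here is just an assumption, and the `accelerate` package must be installed:

```python
from transformers import AutoTokenizer, AutoModel

model_name = "intfloat/e5-large-v2"  # assumed model; substitute your own
tokenizer = AutoTokenizer.from_pretrained(model_name)

# device_map="auto" lets accelerate shard the model across available GPUs
model = AutoModel.from_pretrained(model_name, device_map="auto")

inputs = tokenizer(["query: hello world"], padding=True, truncation=True, return_tensors="pt")
# move inputs to the device holding the first model shard
inputs = {k: v.to(model.device) for k, v in inputs.items()}

outputs = model(**inputs)
# simple mean pooling over tokens (ignores the attention mask for brevity)
embedding = outputs.last_hidden_state.mean(dim=1)
```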
@linkedlist771 Thanks for the suggestion. I'd appreciate it if you could send me a PR for the implementation.
@limcheekin Thank you for your response. I'd be happy to attempt submitting a PR to address this issue. I'll start working on this as soon as possible and submit a PR for your review when it's ready. If you have any specific requirements or suggestions for the implementation, please let me know. I'll strive to ensure the PR adheres to the project's coding standards and best practices.
If I encounter any issues or need clarification during the implementation process, I'll update the progress in this issue. Thank you again for the opportunity to contribute to the project.
Once I launch the server, how can I use it in the same way as below?
```python
from openai import OpenAI
from openai import AsyncOpenAI

client = AsyncOpenAI(api_key="fake-api-key", base_url="http://localhost:8000")
embeddings = client.embeddings.create(
    input=["input"],
)
```
Thanks for your interest. I don't have experience with the AsyncOpenAI class. I think it is not supported now; it is a good candidate for a future enhancement. Please help to open an issue.
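For reference, a minimal sketch of calling the server with the synchronous OpenAI client instead; the port, the `/v1` path, and the model name are assumptions about your deployment, so adjust them to match how you started the server:

```python
from openai import OpenAI

# Sketch only: the base_url (including the /v1 suffix) and the model name
# are assumptions; change them to match your running server.
client = OpenAI(api_key="fake-api-key", base_url="http://localhost:8000/v1")

response = client.embeddings.create(
    model="intfloat/e5-large-v2",  # assumed model name served by your instance
    input=["input"],
)
print(len(response.data[0].embedding))
```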