llama-stack
chore: convert blocking calls to async calls in some providers
What does this PR do?
Converts blocking calls to async calls within the following providers/components:
- runpod (inference)
- sentence_transformers (inference)
- litellm (inference)
Partially addresses #1489
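For readers unfamiliar with this kind of change, here is a minimal sketch of one common way to turn a blocking call into an async one: offloading it to a worker thread with `asyncio.to_thread`. It is not taken from this PR's diff. `blocking_embed` and `embed` are hypothetical names standing in for a provider's synchronous call and its async wrapper, and the providers above may use a different mechanism (for example, a library's native async API).

```python
import asyncio
import time


def blocking_embed(texts: list[str]) -> list[int]:
    """Hypothetical stand-in for a blocking provider call (e.g. a model encode)."""
    time.sleep(0.1)  # simulates slow CPU- or network-bound work
    return [len(t) for t in texts]


async def embed(texts: list[str]) -> list[int]:
    # Run the blocking function in a worker thread so the event loop
    # stays free to serve other requests in the meantime.
    return await asyncio.to_thread(blocking_embed, texts)


async def main() -> list[list[int]]:
    # The two calls overlap instead of running back to back.
    return await asyncio.gather(embed(["a", "bb"]), embed(["ccc"]))


results = asyncio.run(main())
```

Calling `blocking_embed` directly inside an `async def` would stall every other coroutine on the event loop for the duration of the call, which is the problem this kind of conversion is meant to avoid.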
Test Plan
[Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed.]