Ashwin Bharambe
This seems to break if the client uploads an image via URL.

```
File "/home/xiyan/.conda/envs/llamastack-meta-reference-gpu/lib/python3.10/site-packages/llama_stack/providers/inline/inference/meta_reference/inference.py", line 421, in request_with_localized_media
    m.content = await _convert_content(m.content)
  File "/home/xiyan/.conda/envs/llamastack-meta-reference-gpu/lib/python3.10/site-packages/llama_stack/providers/inline/inference/meta_reference/inference.py", line 415, in _convert_content
    return [await _convert_single_content(c)...
```
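For context, here is a minimal hedged sketch (not the actual `meta_reference` code; the helper name and use of `httpx` are assumptions) of the kind of localization step that would let URL-based images reach `_convert_single_content` as plain bytes:

```python
# Hypothetical helper: fetch a URL-based image into raw bytes before content
# conversion, so downstream code only ever deals with inline data.
import httpx


async def _localize_image_url(url: str) -> bytes:
    """Download the image behind a URL and return its raw bytes."""
    async with httpx.AsyncClient() as client:
        response = await client.get(url)
        response.raise_for_status()
        return response.content
```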
### 🚀 The feature, motivation and pitch

This is especially important for a "non-heavyweight-GPU" focused distribution like `ollama` or `tgi-cpu`, etc.

### Alternatives

There isn't a viable alternative.

### Additional...
Make the generators simpler and closer to each other. There is a ton of duplicated code which needs to be removed.

## Test Plan

Run all variants of the matrix:...
I think the implementation needs more simplification. Spent way too much time trying to get the tests to pass with models not co-operating :( Finally had to switch to claude-sonnet to get...
Issue #3809 highlighted the need for runtime provider management and surfaced the two deployment modes we must support:

- **GitOps/PaaS:** each tenant runs its own stack instance; mutations should be...
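To make the distinction concrete, here is a hedged sketch (hypothetical names, not the actual llama-stack API) of a provider registry that gates runtime mutations behind a deployment-level flag, so a GitOps/PaaS tenant stays declarative while other modes can opt in:

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ProviderRegistry:
    """Hypothetical registry: runtime mutations are off by default so a
    GitOps/PaaS deployment keeps its provider set declarative; a shared
    deployment can opt in explicitly."""

    allow_runtime_mutations: bool = False
    providers: dict[str, dict[str, Any]] = field(default_factory=dict)

    def register(self, provider_id: str, config: dict[str, Any]) -> None:
        # Refuse API-driven changes when the deployment is declarative.
        if not self.allow_runtime_mutations:
            raise PermissionError(
                "Runtime provider registration is disabled for this deployment; "
                "update the stack's declarative config instead."
            )
        self.providers[provider_id] = config
```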
- pre-commit is not given the GITHUB_TOKEN so a malicious pre-commit from a fork cannot end up with write access to the repo
- restrict the apply-pre-commit workflow to hooks...