llama-stack
Ollama 4.0 vision and llama-stack: "Invalid token for decoding"
🚀 The feature, motivation and pitch
Ollama vision support is new: https://ollama.com/x/llama3.2-vision
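For reference, the model can be fetched ahead of time. A minimal sketch, assuming the official `ollama` Python client is installed (`pip install ollama`) and using the tag from the link above; this is equivalent to `ollama pull x/llama3.2-vision` on the command line:

```python
import ollama

# Pull the vision model referenced above so it is available locally.
ollama.pull("x/llama3.2-vision")
```

The provider configuration below then points llama-stack at the local Ollama server.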
```yaml
providers:
  inference:
    - provider_id: remote::ollama
      provider_type: remote::ollama
      config:
        host: 127.0.0.1
        port: 11434
```
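Before wiring this provider in, it can be worth confirming that the Ollama server is actually reachable at the configured host and port. A minimal sketch (host and port taken from the config above; a GET on the server root is a standard Ollama liveness check):

```python
import requests

# Host/port taken from the provider config above.
BASE_URL = "http://127.0.0.1:11434"

# Ollama answers its root URL with the plain text "Ollama is running".
resp = requests.get(BASE_URL, timeout=5)
resp.raise_for_status()
print(resp.text)  # expected: "Ollama is running"
```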
In `llama_stack/providers/adapters/inference/ollama/ollama.py`:

```python
OLLAMA_SUPPORTED_MODELS = {
    "Llama3.1-8B-Instruct": "x/llama:latest",
    "Llama3.1-70B-Instruct": "llama3.1:70b-instruct-fp16",
    "Llama3.2-1B-Instruct": "llama3.2:1b-instruct-fp16",
    "Llama3.2-3B-Instruct": "llama3.2:3b-instruct-fp16",
    "Llama-Guard-3-8B": "llama-guard3:8b",
    "Llama-Guard-3-1B": "llama-guard3:1b",
    "Llama3.2-11B-Vision-Instruct": "x/llama:latest",
}
```
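One thing worth checking with a remapping like this is that the Ollama tag on the right-hand side has actually been pulled; if it hasn't, the adapter's request fails before any decoding happens. A sketch using Ollama's `/api/tags` endpoint (the tag is copied from the mapping above; the check itself is illustrative, not part of llama-stack):

```python
import requests

# Value mapped for Llama3.2-11B-Vision-Instruct in the snippet above.
OLLAMA_TAG = "x/llama:latest"

# /api/tags lists all locally pulled models.
models = requests.get("http://127.0.0.1:11434/api/tags", timeout=5).json()["models"]
names = {m["name"] for m in models}
if OLLAMA_TAG in names:
    print(f"{OLLAMA_TAG} is pulled")
else:
    print(f"{OLLAMA_TAG} is missing; pull it first (e.g. `ollama pull {OLLAMA_TAG}`)")
```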
```
Traceback (most recent call last):
  File "/home/guilherme/.local/lib/python3.10/site-packages/llama_stack/distribution/server/server.py", line 206, in sse_generator
    async for item in await event_gen:
  File "/home/guilherme/.local/lib/python3.10/site-packages/llama_stack/providers/impls/meta_reference/agents/agents.py", line 138, in _create_agent_turn_streaming
    async for event in agent.create_and_execute_turn(request):
  File "/home/guilherme/.local/lib/python3.10/site-packages/llama_stack/providers/impls/meta_reference/agents/agent_instance.py", line 179, in create_and_execute_turn
    async for chunk in self.run(
  File "/home/guilherme/.local/lib/python3.10/site-packages/llama_stack/providers/impls/meta_reference/agents/agent_instance.py", line 252, in run
    async for res in self._run(
  File "/home/guilherme/.local/lib/python3.10/site-packages/llama_stack/providers/impls/meta_reference/agents/agent_instance.py", line 427, in _run
    async for chunk in await self.inference_api.chat_completion(
  File "/home/guilherme/.local/lib/python3.10/site-packages/llama_stack/distribution/routers/routers.py", line 101, in
```
Alternatives
No response
Additional context
No response