Toolio
Model type isn't a reliable way to determine e.g. whether the system role is supported
In adding support for Gemma models (#6) I set up flags per model type, including model_flag.NO_SYSTEM_ROLE for whether or not the system role is supported.
Then I ran into h2oai/h2o-danube3-4b-chat, which loads as a llama model type but also doesn't support the system role.
We may actually have to do some sort of check via the tokenizer's chat template. Failing that, we might have to make it a user flag.
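One way to sketch the tokenizer-template check: rather than keying off the model type, try rendering the chat template with a system message at load time and see whether it raises. This is a minimal sketch, not Toolio's implementation; the function name `supports_system_role` is hypothetical, and `tokenizer` is anything exposing `apply_chat_template()` (e.g. a transformers tokenizer).

```python
# Hypothetical capability probe: render the chat template with a system
# message and report whether the template accepts it.
def supports_system_role(tokenizer) -> bool:
    """True if the tokenizer's chat template accepts a system message."""
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hi"},
    ]
    try:
        # Templates that reject the system role call raise_exception(),
        # which surfaces as a jinja2 TemplateError during rendering
        tokenizer.apply_chat_template(messages, tokenize=False)
        return True
    except Exception:
        return False
```

With a real model this would be something like `supports_system_role(AutoTokenizer.from_pretrained(model_path))`, run once at model load to set (or override) model_flag.NO_SYSTEM_ROLE.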
```sh
# On server:
toolio_server --model=$HOME/.local/share/models/mlx/h2o-danube3-4b-chat-4bit

# On client:
echo 'What is the square root of 256?' > /tmp/llmprompt.txt
cat > /tmp/toolspec.json <<'EOF'
{
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "square_root",
        "description": "Get the square root of the given number",
        "parameters": {
          "type": "object",
          "properties": {
            "square": {"type": "number", "description": "Number from which to find the square root"}
          },
          "required": ["square"]
        },
        "pyfunc": "math|sqrt"
      }
    }
  ],
  "tool_choice": "auto"
}
EOF
toolio_request --apibase="http://localhost:8000" --prompt-file=/tmp/llmprompt.txt --tools-file=/tmp/toolspec.json
```
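The failure can be reproduced without the server: templates in this family reject system messages by calling a registered `raise_exception()` helper during rendering, which is the same mechanism transformers wires up in `apply_chat_template`. Below is a minimal stand-in template (not danube's actual one) showing the behavior:

```python
from jinja2 import Environment
from jinja2.exceptions import TemplateError

def raise_exception(message):
    # Mirrors the helper transformers registers for chat templates
    raise TemplateError(message)

# Stand-in for a chat template that rejects the system role; the real
# h2o-danube3 template is more elaborate but fails the same way
template_src = (
    "{% for m in messages %}"
    "{% if m['role'] == 'system' %}"
    "{{ raise_exception('System role not supported') }}"
    "{% else %}{{ m['role'] }}: {{ m['content'] }}\n{% endif %}"
    "{% endfor %}"
)

env = Environment()
env.globals["raise_exception"] = raise_exception
template = env.from_string(template_src)

try:
    template.render(messages=[{"role": "system", "content": "..."}])
except TemplateError as e:
    print(e)  # prints: System role not supported
```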
Server error excerpt:
```
File "/Users/uche/.local/venv/temp/lib/python3.11/site-packages/toolio/cli/server.py", line 267, in post_v1_chat_completions_impl
for result in app.state.model.completion(
File "/Users/uche/.local/venv/temp/lib/python3.11/site-packages/toolio/schema_helper.py", line 293, in completion
prompt_tokens = self.tokenizer.encode_prompt(prompt)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/uche/.local/venv/temp/lib/python3.11/site-packages/llm_structured_output/util/tokenization.py", line 34, in encode_prompt
return self.tokenizer.apply_chat_template(prompt)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/uche/.local/venv/temp/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 1812, in apply_chat_template
rendered_chat = compiled_template.render(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/uche/.local/venv/temp/lib/python3.11/site-packages/jinja2/environment.py", line 1304, in render
self.environment.handle_exception()
File "/Users/uche/.local/venv/temp/lib/python3.11/site-packages/jinja2/environment.py", line 939, in handle_exception
raise rewrite_traceback_stack(source=source)
File "<template>", line 1, in top-level template code
File "/Users/uche/.local/venv/temp/lib/python3.11/site-packages/jinja2/sandbox.py", line 394, in call
return __context.call(__obj, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/uche/.local/venv/temp/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 1852, in raise_exception
raise TemplateError(message)
jinja2.exceptions.TemplateError: System role not supported
```
BTW, I'm eager to try out Danube because surely there's no way we could do tool-calling with a 500M model, but you never know! 😁