feat: add new flag --default-template for models to use tools
Summary by Sourcery
Add a --default-template flag to skip loading model-specific chat template files and use the runtime's default chat template instead, wiring this option through the CLI, service command factory, model container generation, and documentation.
New Features:
- Introduce --default-template CLI option for run and serve commands
- Apply the default_template flag to skip adding a model-specific chat-template-file in llama serve and container config generation
- Honor default_template flag in daemon command factory when building llama serve commands
Enhancements:
- Refactor chat template path handling to conditionally include template based on default_template flag
- Unify chat template paths into a single tuple or None in container config generation
Documentation:
- Document --default-template option in ramalama-run and ramalama-serve man pages
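The CLI wiring described above can be sketched roughly as follows. This is a hypothetical illustration; the actual parser setup in ramalama/cli.py, including option defaults and help text, may differ.

```python
import argparse

def add_default_template_arg(parser: argparse.ArgumentParser) -> None:
    # Boolean flag: when set, skip the model-specific chat template file
    # and let the inference runtime fall back to its built-in template.
    parser.add_argument(
        "--default-template",
        action="store_true",
        default=False,
        help="use the runtime's default chat template instead of a "
             "model-specific chat template file",
    )

parser = argparse.ArgumentParser(prog="ramalama run")
add_default_template_arg(parser)
args = parser.parse_args(["--default-template"])
print(args.default_template)  # True when the flag is passed
```

The same helper would be registered for both the run and serve subcommands so the flag behaves identically in either mode.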
Reviewer's Guide
Introduces a new --default-template flag that, when enabled, bypasses model-specific chat template file handling across runtime commands, container config generation, and daemon service invocations by conditionally skipping template path retrieval and passing None; argument parsing and documentation are updated accordingly.
File-Level Changes
| Change | Files |
|---|---|
| Gate chat template file inclusion in llama_serve based on default_template flag | ramalama/model.py |
| Refactor generate_container_config to conditionally prepare chat_template_paths | ramalama/model.py |
| Apply default-template gating in service command factory | ramalama/daemon/service/command_factory.py |
| Add --default-template parser option for run and serve commands | ramalama/cli.py |
| Document the new --default-template option in man pages | docs/ramalama-run.1.md, docs/ramalama-serve.1.md |
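The gating pattern in the table above can be sketched as a small Python example. Function names, paths, and the exact llama-server arguments here are illustrative assumptions, not the actual ramalama/model.py implementation.

```python
from typing import Optional, Tuple

def chat_template_paths(default_template: bool,
                        host_path: str,
                        container_path: str) -> Optional[Tuple[str, str]]:
    """Return (host, container) chat template paths, or None to fall
    back to the inference engine's built-in default template."""
    if default_template:
        return None
    return (host_path, container_path)

def build_serve_args(default_template: bool) -> list:
    # Hypothetical command assembly; the real command factory builds
    # a fuller llama-server invocation.
    args = ["llama-server", "--port", "8080"]
    paths = chat_template_paths(default_template,
                                "/models/chat.jinja", "/mnt/chat.jinja")
    if paths is not None:
        # Only pass a template file when one was prepared; otherwise the
        # engine's default template is used.
        args += ["--chat-template-file", paths[1]]
    return args

print(build_serve_args(True))   # no --chat-template-file appended
print(build_serve_args(False))  # includes --chat-template-file
```

Collapsing the two paths into a single tuple-or-None value keeps the container config generation and the daemon command factory consistent: both only need one None check.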
Would it make more sense to allow the user to specify a chat template with --chat-template /tmp/chat.template? And then have --chat-template none or --chat-template default?
Yes, I think having a --chat-template-file <path> would be great and aligns with what llama-server does. This RamaLama CLI option would have the highest priority, followed by the extracted chat template and, finally, the inference engine's default template as a fallback. I think we discussed this at one point, but lost track of it.
When running within the container, the --chat-template option would have to volume mount the path into the container. This would complicate the use of quadlets and kube.yaml, but for now let's just add this; we would have to point out that this would need to be handled within an image if a user puts the AI into production, potentially by having the user ship the template within the container. --chat-template=none would just remove the --chat-template option from the inference engine.
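The priority order proposed above could be resolved with a small helper like this. The function name and sentinel values ("none", "default") are illustrative assumptions about the suggested design, not code from the PR.

```python
from typing import Optional

def resolve_chat_template(cli_value: Optional[str],
                          extracted_path: Optional[str]) -> Optional[str]:
    """Pick the chat template path by descending priority:
    explicit CLI path > template extracted from the model > None
    (None means the inference engine uses its built-in default)."""
    if cli_value in ("none", "default"):
        # User explicitly asked for the engine's default template.
        return None
    if cli_value is not None:
        # A user-supplied template file always wins.
        return cli_value
    # Fall back to whatever template was extracted from the model, if any.
    return extracted_path

print(resolve_chat_template("/tmp/chat.template", "/models/extracted.jinja"))
print(resolve_chat_template("none", "/models/extracted.jinja"))
print(resolve_chat_template(None, "/models/extracted.jinja"))
```

A None result would simply omit the chat-template argument from the engine command line, matching the behavior described for --chat-template=none.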
A friendly reminder that this PR had no activity for 30 days.
@bmahabirbu Should this PR be closed or are you still working on it?
A friendly reminder that this PR had no activity for 30 days.