
feat: added new flag default-template for models to use tools

Open bmahabirbu opened this issue 3 months ago • 6 comments

Summary by Sourcery

Add a --default-template flag to skip loading model-specific chat template files and use the runtime's default chat template instead, wiring this option through the CLI, service command factory, model container generation, and documentation.

New Features:

  • Introduce --default-template CLI option for run and serve commands
  • Apply the default_template flag to skip adding the model-specific chat-template-file in llama serve and container config generation
  • Honor default_template flag in daemon command factory when building llama serve commands
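The gating described above can be sketched as follows. This is a minimal illustration, not RamaLama's actual code: `get_chat_template_path` and the dict-based model are hypothetical stand-ins for the real lookup.

```python
from types import SimpleNamespace


def get_chat_template_path(model):
    # Illustrative stand-in for RamaLama's model-specific template lookup.
    return model.get("chat_template")


def build_llama_serve_args(args, model):
    exec_args = ["llama-server", "--model", model["path"]]
    # With --default-template, skip the model-specific template entirely
    # so the inference engine falls back to its built-in default.
    if not getattr(args, "default_template", False):
        path = get_chat_template_path(model)
        if path:
            exec_args += ["--chat-template-file", path]
    return exec_args


model = {"path": "/models/m.gguf", "chat_template": "/models/m.template"}
print(build_llama_serve_args(SimpleNamespace(default_template=True), model))
print(build_llama_serve_args(SimpleNamespace(default_template=False), model))
```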

Enhancements:

  • Refactor chat template path handling to conditionally include template based on default_template flag
  • Unify chat template paths into a single tuple or None in container config generation
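The tuple-or-None unification might look like the following sketch; the function and parameter names are illustrative, not the actual ones in `generate_container_config`.

```python
def prepare_chat_template_paths(default_template, src, dest):
    # Collapse the source and destination paths into a single (src, dest)
    # tuple, or None when the runtime's default template should be used.
    if default_template or src is None:
        return None
    return (src, dest)


print(prepare_chat_template_paths(True, "/host/t.jinja", "/ctr/t.jinja"))
print(prepare_chat_template_paths(False, "/host/t.jinja", "/ctr/t.jinja"))
```

Passing a single value keeps every `generate` call site from having to carry two separate path arguments plus a flag.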

Documentation:

  • Document --default-template option in ramalama-run and ramalama-serve man pages

bmahabirbu avatar Sep 18 '25 19:09 bmahabirbu

Reviewer's Guide

Introduces a new --default-template flag that, when enabled, bypasses model-specific chat template file handling across runtime commands, container config generation, and daemon service invocations by conditionally skipping template path retrieval and passing None; argument parsing and documentation are updated accordingly.

File-Level Changes

Gate chat template file inclusion in llama_serve based on the default_template flag
  • Extract args.default_template into use_default_template
  • Wrap chat_template_path retrieval and exec_args append in "if not use_default_template"
  Files: ramalama/model.py

Refactor generate_container_config to conditionally prepare chat_template_paths
  • Check args.default_template to set chat_template_src/dest to None or call _get_chat_template_path
  • Build chat_template_paths tuple or None
  • Replace direct chat_template_src_path/chat_template_dest_path in generate calls
  Files: ramalama/model.py

Apply default-template gating in the service command factory
  • Fetch default_template from request_args
  • Wrap chat_template_path lookup and cmd append in "if not use_default_template"
  Files: ramalama/daemon/service/command_factory.py

Add --default-template parser option for run and serve commands
  • Add parser.add_argument for --default-template with store_true action
  • Scope the flag to the run and serve command contexts
  Files: ramalama/cli.py

Document the new --default-template option in man pages
  • Add description of the default-template flag in ramalama-run.1.md
  • Add description of the default-template flag in ramalama-serve.1.md
  Files: docs/ramalama-run.1.md, docs/ramalama-serve.1.md
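The parser wiring can be sketched with standard argparse. This is a minimal illustration only; RamaLama's real CLI shares options across subcommands through its own helpers, which are elided here.

```python
import argparse

# Hypothetical sketch: register --default-template on the run and serve
# subcommands as a store_true flag.
parser = argparse.ArgumentParser(prog="ramalama")
sub = parser.add_subparsers(dest="command")
for name in ("run", "serve"):
    p = sub.add_parser(name)
    p.add_argument(
        "--default-template",
        dest="default_template",
        action="store_true",
        help="use the runtime's default chat template instead of the model's",
    )

args = parser.parse_args(["serve", "--default-template"])
print(args.command, args.default_template)
```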


sourcery-ai[bot] avatar Sep 18 '25 19:09 sourcery-ai[bot]

Would it make more sense to allow the user to specify a chat template with --chat-template /tmp/chat.template? And then have --chat-template none or --chat-template default?

rhatdan avatar Sep 18 '25 20:09 rhatdan

Would it make more sense to allow the user to specify a chat template with --chat-template /tmp/chat.template? And then have --chat-template none or --chat-template default?

Yes, I think having a --chat-template-file <path> would be great and aligns with what llama-server does. This RamaLama CLI option would have the highest priority, followed by the extracted chat template and, finally, the inference engine's default template as a fallback. I think we discussed this at one point, but lost track of it.
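The proposed priority order could be sketched like this; the function name and the "none" sentinel are assumptions based on the discussion above, not implemented behavior.

```python
def resolve_chat_template(cli_path, extracted_path):
    # Proposed priority: an explicit --chat-template-file wins, then the
    # template extracted from the model. Returning None lets the inference
    # engine fall back to its built-in default; "none" forces that fallback.
    if cli_path == "none":
        return None
    if cli_path:
        return cli_path
    return extracted_path


print(resolve_chat_template("/tmp/chat.template", "/models/m.template"))
print(resolve_chat_template(None, "/models/m.template"))
print(resolve_chat_template("none", "/models/m.template"))
```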

engelmi avatar Sep 19 '25 08:09 engelmi

When running within the container, the --chat-template option would have to volume mount the path into the container. This would complicate the use of quadlets and kube.yaml, but for now let's just add this; we would have to point out that this would need to be handled within an image if a user puts the AI into production, potentially by having the user ship the template within the container. --chat-template=none would just remove the --chat-template option from the inference engine.
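The volume mount described above might look something like the following. The image name and paths are purely illustrative, not RamaLama's actual defaults; the command is only echoed here, not run.

```shell
# Hypothetical: bind-mount a host chat template into the inference container
# so llama-server inside it can read the file.
TEMPLATE=/tmp/chat.template
CMD="podman run --rm -v $TEMPLATE:/chat.template:ro,Z \
quay.io/ramalama/ramalama:latest \
llama-server --chat-template-file /chat.template"
echo "$CMD"
```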

rhatdan avatar Sep 22 '25 11:09 rhatdan

A friendly reminder that this PR had no activity for 30 days.

github-actions[bot] avatar Oct 23 '25 00:10 github-actions[bot]

@bmahabirbu Should this PR be closed or are you still working on it?

rhatdan avatar Nov 03 '25 13:11 rhatdan

A friendly reminder that this PR had no activity for 30 days.

github-actions[bot] avatar Dec 04 '25 00:12 github-actions[bot]