llama-cpp-python Chat template rendering extensions to match transformers

I've been using the jinja2 tojson filter in a couple of chat templates to render function calling instructions, which is working very well, however sometimes this will introduce escaped unicode characters, which is undesirable.

This PR ~~simply changes the default parameters of tojson~~ replaces tojson so that the JSON is rendered in unicode without escaped HTML characters.

Examples of models using tojson:

I've also removed the leftover (and unnecessary) loader parameter since I was touching this code area anyway.

Update:

Further changes to chat templates are being made in transformers and since they are dependant on the changes already made here I will simply append them to this PR:

Added loopcontrols jinja2 extension (adds break and continue)
Added strftime_now to render the current time

See huggingface/transformers#32684

May 26 '24 06:05 CISC

Since the merging of a similar patch to transformers and the formalizing of tools (see references in #1336) there are starting to pop up a lot more models using tojson in their chat templates.

Jun 20 '24 20:06 CISC

@abetlen Llama 3.1 is the latest model to use tojson, would be nice to have matching behaviour as transformers.

Aug 01 '24 14:08 CISC

@abetlen Even more transformers changes incoming, see updated OP.

Aug 15 '24 08:08 CISC

@abetlen It would be really nice to have this merged soon, it's starting to cause issues with quite a few templates not to have this in place.

Additionally this is a requirement for implementing inverse templates (which I'd like to do, as soon as it's merged in transformers, since it will be super useful for universal function calling and parsing).

Oct 08 '24 18:10 CISC

Hello I support to merge it. I have issues with Mistral models (devstral too)

Jun 13 '25 16:06 serhii-nakon

@abetlen I see strftime_now support was finally added, unfortunately that is not enough, PTAL.

Aug 08 '25 09:08 CISC