jinja icon indicating copy to clipboard operation
jinja copied to clipboard

Allow custom autoescape functions as result of the autoescape function parameter for the Environment

Open CarliJoy opened this issue 4 years ago • 4 comments

I am currently trying to build an app that generates PDFs based on LaTeX imports. For this I use django extensions that renders the LaTeX file and also provides the possibility to escape single variables.

Every variable that is given to the templates system can be modified by the users, so I would need to escape every single occurrence, which is error prone.

Therefore using the autoescape feature of Jinja would be great.

Reading #503, which was closed unsuccessfully there are currently two way to implement this kind of autoescape

  1. Writing an Extension that overwrites the filter_stream and and adds a | my_custom_filter add the end of every variable. (used by JinjaSQL)
  2. Overwrite the Jinja default escape function by a custom one (used by Jinja Vanish)

Both solution feel hacky and unstable.

That why I would suggest to expand the autoescape parameter from the Environment to accept not only functions returning a boolean but also a custom escape function.

Like this

from typing import Union, Callable
from jinja2 import Environment, PackageLoader

def do_latex_escape(value: str) -> str:
    return (
        value.replace("&", "\\&")
        .replace("$", "\\$")
        .replace("%", "\\%")
        .replace("#", "\\#")
        .replace("_", "\\_")
        .replace("{", "\\{")
        .replace("}", "\\}")
    )


def smart_autoescape(template_name: str) -> Union[bool, Callable[[str], str]] :
    if template_name.endswith(('.html', '.htm', '.xml')):
        return True  # Use Default Escape Function
    if template_name.endswith(('.latex', '.tex', '.luatex')):
        return do_latex_escape  # for LaTeX Files escape LaTeX
    return False

env = Environment(autoescape=smart_autoescape,
                  loader=PackageLoader('mypackage'))

I would expect the following to happen in the above case:

  • Given a HTML template
    • Autoescape and everything as normal
  • Given a template with any latex extension
    • autoescape is enabled`
    • the function do_latex_escape is wrapped to it does not escape safe text (example
    • the default escape filter (| e) and (| escape) is replaced by the do_latex_escape filter
    • autoescaping is done with the new filter
  • Everything else
    • Autoescape is disabled

What do you think about the suggestion? If you like it, I would try to create a PR. If I understand it correctly Jinja Vanish already provides some good starting point for it.

CarliJoy avatar Apr 01 '21 18:04 CarliJoy

I like this idea, it seems like a decent way to expand autoescape without needing to change Jinja itself much. If you want to work on a PR try to get it in soon if you want this in 3.0.

davidism avatar Apr 04 '21 23:04 davidism

@davidism I looked a bit in the source code and it seems HTML escaping is quite strong hardcoded in the source code due to the use of markupsafes Markup class. This is a bit surprising as both documentation and homepage mention that Jinja is explicitly not only focused on HTML templates and even mentions it for usage with LaTeX.

But the Markup class uses its internal (html) escape function i.e. if the modulo operator %.

I already was successful in creating a patch that implements the mentioned changes but they seem to be quite unstable as I already found some edges cases in which unexpected behaviour occurs.

The only way I see in making this feature really reliable is in removed the hardcoding to the (HTML) Markup class by removing the hardcoded Markup class creations.

Therefore I created #1382 to discuss the idea.

Looking through the source code it seems manageable and I would give it a try. But not if this is considered a too big change in the source code...

Could you specify "soon" a bit? I really would like to see this feature in a release soon and I am willing to put some work into it.

CarliJoy avatar Apr 04 '21 23:04 CarliJoy

MarkupSafe implements a string subclass that intercepts all operations and applies escaping again. See https://en.wikipedia.org/wiki/Taint_checking for the concept. It should already be possible to subclass this and override the escape method to change what is escaped.

All Jinja should need is the equivalent of a Markup class and escape function. Jinja uses nodes.EvalContext to track whether autoescape is enabled for the current context, but always uses Markup and escape right now. The eval context could be extended to also hold the two callables to use. The compiler and probably a few other places would need to use that instead of the hard coded calls.

Soon as in the next few days. I'm working to finish merging existing PRs now before making a new 3.0 release candidate. It can always go in 3.1, and may end up there anyway depending on what I can get to.

davidism avatar Apr 05 '21 00:04 davidism

I just added a bit more explanation to the documentation so end users won't run in possible pitfalls. Gave me best to explain it as good and detailed as possible. But given I am bit dyslexic it would be good if somebody could read the added documentation again.

This is a task that could be done also be non technical people :-)

CarliJoy avatar Apr 09 '21 12:04 CarliJoy