askama icon indicating copy to clipboard operation
askama copied to clipboard

recognize a template instance and don't bother escaping it

Open droundy opened this issue 5 years ago • 21 comments

A lovely feature would be to figure out how to recognize that an expression is a type that has derived Template, and then not escape its output (assuming both templates involved are html). I'm not sure how to implement this, but it took me a while to figure out that I needed {{ var|safe }} to avoid having the html escaped.

I can't imagine a scenario where you'd have a type with an html template, and then want that output escaped when the variable is shown in another template.

droundy avatar Sep 01 '18 04:09 droundy

Yeah, this is a bit tricky. It's been part of the contract that Template implementors also implement Display, and so without specialization I'm not sure it's possible to do have both of those features.

A somewhat similar issue comes up with non-stringy types (like numbers) which ideally you wouldn't escape for performance reasons.

One way this maybe could work is if there is a crate-internal trait SafeToInclude (bikeshed needed) which is implemented for known-safe types like Template and i* and u* and other things which only defers to Display if there is no other implementation.

djc avatar Sep 01 '18 08:09 djc

Yeah, specialization does seem it will/would be the "right" way to do this. I'm not sure how your crate-internal SafeToInclude trait would work. You couldn't define it for all possible types, so you'd still be left with the same problem in determining if it were to work properly, right?

Unless you were to restrict the template expansion to only accept expressions with type SafeToInclude, and then users would be expected to implement that trait for any non-Template types that they want to use in their templates? That does seem practical, and would have the added feature that users could enable their custom types to have a different Display appearance from their included appearance, which actually seems like a real plus. Several of my tiny templates really could have been rust code even more cleanly.

Actually, stepping back, couldn't you just use Template itself to handle SafeToInclude? If you implemented Template on all the standard concrete types, and then required expression values to implement Template, then you'd need no extra bikeshedding. The only downside I see is that you might want to have some concrete types have distinct rendering in multiple formats, e.g. html versus latex, imagining that you supported latex files (with escaping) in the future. But perhaps that could just work by abusing the extension method?

droundy avatar Sep 01 '18 12:09 droundy

I've been wondering if you'd be open to a breaking change on this, obviously with an accompanying version bump?

The more I look at this, it seems to be a kind of type safety bug. Specifically, askama fails to distinguish the "type" of a string, as in whether it is HTML or plain Unicode, or LaTeX or whatever. And it neither makes this distinction at compile time nor at runtime, which means that if you ever miss a "safe" you get incorrect escaping, and if you ever have an extra "safe" then you may have a security be hole at worst and a poor display at best.

When I think of the problem in this way it seems like I'd want to put the error at compile time, which means a breaking change. I have some ideas for a new API but want to hear first whether in principle you'd be open to a breaking change.

droundy avatar Nov 04 '18 17:11 droundy

Definitely not opposed to API change per se, but I haven't been able to come up with a good idea so far. So, I'd like to hear/see your ideas!

djc avatar Nov 04 '18 17:11 djc

I see two issues that I'd like to change. One is using Display to insert variables, which means the representation cannot be specific to the format and escaping is required. The second is that a given type can only have a Template for one format. So we cannot have e.g. a separate implementation for HTML and TeX.

My current thinking is to add a type parameter to the Template class, so it would be something like Template<HTML>. This would enable us to have different templates for different formats, e.g. if I want to support both HTML and LaTeX rendering of the same thing. Then the implementation would expand a variable such as {{ foo }} in HTML context by using the Template<HTML> implementation for the variable foo. If you want to use the Display implementation instead, you'd have to specify {{ foo | safe }} or {{ foo ! escape }} or something.

Askama would have some set of built in formats (HTML, maybe JavaScript, maybe LaTeX math mode and LaTeX text mode?) and would implement Template for those types for all the standard types for which it makes sense. And each format could be specified work something like {{ foo | latexmath }}, which would be handy on web pages using mathjax for instance.

That's most of what I'm thinking at the moment. This would break any templates the rely on the current behavior using Display, but I think that is good. Using display by default is the thing that I'm changing, precisely because it is ambiguous.

droundy avatar Nov 04 '18 22:11 droundy

So do you have an actual case for templating the same type into different content types? If so, how do you think that should look in terms of attributes for such a type? It might actually be more relevant to you that you can have different escape modes in a single template buffer, which would need different things.

There's also some overlap with the discussion in #136, if you had not seen that yet.

djc avatar Nov 05 '18 12:11 djc

The simple example of a type that could benefit from multiple mixed representations would be f64. In latex math mode (which I might want to use within HTML using mathjax) I'd want to represent 1e40 as "10^{40}". In latex text mode I'd want this to be "$10^{40}$". In no case would I want "10000000000000000000000000000000000000000."

Of course, I don't expect a template to be defined for f64. To find an example where you would want multiple templates, we need a case where you use two formats together. One case that has been on my mind recently would be a homework database for Physics (we are soon to implement one). We would want to have homework problems in some format, some in LaTeX and sadly some in Word. Users would want to pick a set of homework problems and generate a problem set PDF as well as a solutions PDF. When picking the problems you'd want them displayed in HTML, in some cases as an image but better to use mathjax when possible. You'd also want an online version of the problem set in HTML, as well as the print version done in LaTeX. So the same information will need to be put into a variety of different formats. One could use different data types for each format, or could embed the format in the data type at every level, or use methods rather than templates to generate the representations. But it seems like it would be nicest to have the data type simply describe the content and let the templates describe the representation of it.

P.S. This was written on my phone so I apologise for any resulting unclarity.

droundy avatar Nov 06 '18 15:11 droundy

So Askama internally also has the MarkupDisplay type. I think this might be a better fit for the parametrization by output type that you described.

In general I want Askama to be powerful enough to support the type of advanced stuff you're talking about here, but it's also important to me that the simple cases it covers today don't become much harder (following the "simple things easy, complex things possible" adage). But, realistically, getting there will require some exploration of the problem and solution space which I won't be able to drive myself (except maybe if you can/want to hire me to do so), though I can guide it along towards a direction that I'm willing to support in Askama itself.

djc avatar Nov 06 '18 15:11 djc

I agree that my proposal would also address #136 by enabling different escaping for different formats.

droundy avatar Nov 06 '18 15:11 droundy

https://github.com/rust-lang/rust/issues/31844

zzau13 avatar Nov 10 '18 18:11 zzau13

How would you feel about something like the following as a solution:

pub trait MarkupDisplayable<E: Escaper> {
    type Repr: Display;
    fn display_value(self) -> DisplayValue<Self::Repr>;
}

pub struct MarkupDisplay<E: Escaper, T: MarkupDisplayable<E>> {
    value: T,
    escaper: E,
}

MarkupDisplayable has no blanket impl, but can be derived for any type that implements Display, and is implemented for all the standard library types that implement Display. If you have a type of your own that you want to display in an askama template, you can derive MarkupDisplayable or implement it yourself for whatever escapers you want to use it with. The only change needed to make old code work with this new version is to add MarkupDisplayable to the derives on whatever types you've been using in templates, and now you can define types that have their own custom escaping routines or no escaping at all. Then once specialization is ready for general use, MarkupDisplayable can get a default implementation for all Display types.

xenofem avatar Feb 15 '21 20:02 xenofem

Actually realizing that migrating old code to work with this change may be more annoying if you're displaying foreign types in Askama; you'd either need to make a newtype, or include a filter that handles types that implement Display but not MarkupDisplayable :/ Maybe the default should be for MarkupDisplayable to get its blanket implementation, and there can be a feature that removes the blanket implementation for those who want custom implementations on their own types?

xenofem avatar Feb 15 '21 20:02 xenofem

Without specialization in the language, I'm not sure there's going to be a design here that I find satisfying in terms of cost/benefit ratio. I think we could have some trait that we implement for well-known types (like relevant std types and maybe any type which implements Template) and that would go a long way. I guess downstream crates could also implement that trait for their own types?

djc avatar Feb 16 '21 08:02 djc

If you were willing to drop the requirement that Template implement Display then you could use some other trait, say Renderable as the thing you call when rendering expressions, pass though any necessary context and implement it for everything else that implements Display. You could still implement ToString for the templates for easy use in other contexts.

// Example of some context that might be wanted by inner-templates
enum Escaping {
    None,
    Html,
}

trait Renderable {
    fn render(&self, escaping: Escaping) -> String;
}

impl<T> Renderable for T
where 
    T: std::fmt::Display
{
    fn render(&self, escaping: Escaping) -> String {
        self.to_string()
    }
}

struct ATemplateImpl {
}

impl Renderable for ATemplateImpl {
    fn render(&self, escaping: Escaping) -> String {
        "hello".to_string()
    }
}

impl ToString for ATemplateImpl {
    fn to_string(&self) -> String {
        self.render(Escaping::Html)
    }
}

Mossop avatar Jun 15 '23 13:06 Mossop

@Mossop any particular reason this has come up for you now?

I feel like at this point templates implementing Display is a substantial part of the API contract, and breaking that would cause quite a bit of downstream churn. I'd like to see a serious exploration of the options around deref-based specialization first before we consider that.

djc avatar Jun 15 '23 13:06 djc

I only just started using asakama and I hit this bug almost immediately so figured I'd make a suggestion 🤷

Mossop avatar Jun 15 '23 13:06 Mossop

One major problem I see is that this would break current use cases which assume that Templates get escaped, too.

Maybe I am using {{{ var }}} in the inner template, because I know that the outer template is going to escape my data. Or I am including the inner template with {{{ tmpl }}}, because I know that the inner template already has escaped all user supplied data.

IMO using an explicit {{{ raw }}} expressions is the best option. Printing double escaped data is a little bug, printing unescaped data is quite a bit worse.

Kijewski avatar Jun 15 '23 15:06 Kijewski

autoref-specialization seems like it could be viable here? (Depending on what the generated code looks like)

Imberflur avatar Apr 17 '24 22:04 Imberflur

Yes, it could. I've played with that once but didn't get it to work but if someone wants to spend time on it, would be cool!

djc avatar Apr 18 '24 07:04 djc

The advantage would be, that we could possibly mark arbitrary types as safe, so the HTML escaper has to be called less often, e.g. when printing numbers.

Kijewski avatar Apr 18 '24 11:04 Kijewski

Yeah, could provide a nice performance benefit.

djc avatar Apr 18 '24 12:04 djc