"double_encode: false" in html escaping strategy

Open m-vo opened this issue 4 years ago • 3 comments

Twig's default html escaping strategy double-encodes any entities already present in the original source. This is the expected default behavior of htmlspecialchars, which is used under the hood.
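To make the behavior concrete, here is a minimal plain-PHP sketch of the difference the `double_encode` argument of `htmlspecialchars` makes. The sample string and the flag combination are assumptions chosen for illustration and are not taken from Twig's source.

```php
<?php
// Hypothetical sample value: text that a legacy system has already entity-encoded.
$legacy = 'Twig &#40;nice!&#41;';

// Default call (double_encode = true): the "&" of each existing entity is encoded again.
$default = htmlspecialchars($legacy, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

// Same call with double_encode = false: existing entities are left untouched.
$once = htmlspecialchars($legacy, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8', false);

echo $default, PHP_EOL; // Twig &amp;#40;nice!&amp;#41;
echo $once, PHP_EOL;    // Twig &#40;nice!&#41;
```

In a browser the first line displays the entity text literally, while the second displays "Twig (nice!)".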

For most use cases this is perfectly fine. If, however, double encoding must be prevented, it's very cumbersome to achieve (lots of reimplementing needed). In my case, data is partly coming from a legacy system that still uses input encoding. I do not want to apply |raw everywhere, especially because that would make it impossible to switch to output encoding later.

Just setting double_encode to false in the htmlspecialchars calls would probably be safe to do, but it could be considered a BC break. It would be very handy if this could either be configured somehow or if there was an easier way to achieve it in a framework/application. Maybe I'm also just not seeing the latter right now and it's already possible. :relaxed:

So to recap: I'd like an already-encoded value (for example `Twig &#40;nice!&#41;`) not to be transformed into `Twig &amp;#40;nice!&amp;#41;` but to stay the same, effectively displaying: "Twig (nice!)" :slightly_smiling_face:

m-vo avatar Apr 29 '21 13:04 m-vo

Well, the issue is that setting double_encode: false globally in the HTML escaper would allow injecting HTML entities in the output for any code that does not pre-escape &. So it might not actually be safe.
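A minimal plain-PHP sketch of that concern, assuming a hypothetical helper `escape_html_once` and a made-up input string (neither is part of Twig):

```php
<?php
// Hypothetical helper: HTML-escape a value without re-encoding existing entities.
function escape_html_once(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8', false);
}

// Untrusted text that was never pre-escaped; the user literally typed "&lt;b&gt;".
$typed = 'Write &lt;b&gt; for bold, and 1 < 2';

// Regular escaping keeps the typed entity text visible as typed.
$regular = htmlspecialchars($typed, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

// Escaping "once" passes the typed entities through to the output unchanged.
$once = escape_html_once($typed);

echo $regular, PHP_EOL; // Write &amp;lt;b&amp;gt; for bold, and 1 &lt; 2
echo $once, PHP_EOL;    // Write &lt;b&gt; for bold, and 1 &lt; 2
```

With the second variant a browser shows "Write <b> for bold, and 1 < 2", so the entity text the user typed is silently turned into different characters. The raw `<` is still escaped in both cases, so no actual tag is created.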

stof avatar Apr 29 '21 13:04 stof

Yeah, my intention wasn't to set it globally but be able to control it programmatically. In the end I'd like to prevent double encoding input I know is already encoded without the need to apply a raw filter in the templates. (This way there is an upgrade path once the input encoding goes away.)

I can probably write my own escaper that does just that and probably use a node visitor + custom node to alter the output. But that's a lot of fiddling for this problem. :-/

m-vo avatar May 03 '21 16:05 m-vo

> … would allow injecting HTML entities in the output for any code that does not pre-escape &. So it might not actually be safe.

Injecting HTML entities can be safe depending on the context, I think (at least for our use case we consider it safe), but we want to prevent the injection of any other special HTML characters like < and >.

While this might be a rather project-specific use case, it would be great if there was a way to configure or replace the html escaper.

ausi avatar May 07 '21 10:05 ausi