ILIAS icon indicating copy to clipboard operation
ILIAS copied to clipboard

`Refinery`: Transformation for escaping html/css/js

Open lscharmer opened this issue 1 year ago • 1 comments

This PR introduces 5 new transformations to escape: Javascript, HTML, HTML attributes, CSS and URL's. The implementation is based upon this repository: https://github.com/laminas/laminas-escaper/

lscharmer avatar Sep 12 '23 11:09 lscharmer

Hi @lscharmer,

I will assign this to you, so this does not appear on my agenda anymore.

Please assign back if this is ready for the next iteration.

Kind regards!

klees avatar Feb 06 '24 12:02 klees

Hi @klees,

I updated this PR and addressed your points:

* [ ]  CSS: Why do we need this transformation? Do we have a usecase for it?

In agreement with @mjansen, we decided that we don't and it is therefore removed from this PR :)

* [ ]  HTMLAttributes: Why do we need this transformation? Do we have a usecase for it?

There are many places where PHP variables are passed to templates where they are used as HTML attributes. E.g. https://github.com/ILIAS-eLearning/ILIAS/blob/0514bac85249b86002d05915fcab7427e15d3a63/components/ILIAS/UI/src/Implementation/Component/Link/Renderer.php#L74 would be one use case where the value should be escaped as a HTML attribute.

* [ ]  JavaScript: Why do we need this transformation? Do we have a usecase for it?

This is useful in places where JavaScriptBindable::withAdditionalOnLoadCode(...) is used to add JS from PHP. E.g. Here: https://github.com/ILIAS-eLearning/ILIAS/blob/0514bac85249b86002d05915fcab7427e15d3a63/components/ILIAS/UI/src/Implementation/Component/Button/Renderer.php#L94

* [ ]  u: What does the `u` modifier in your regexps do? [This](https://github.com/ILIAS-eLearning/ILIAS/pull/6314/files#diff-92e0d450d8ec63ba77a6e87023294d0d2a6d9f061728da368f85befca262de78R39) seems to suggest that it somehow only matches UTF-8 strings, but I could not find according docu. Could you link some? Could you give me an example of a string that would make that expression fail?

The u modifier is very important. The behaviour of the modifier is documented here: https://www.php.net/manual/en/reference.pcre.pattern.modifiers.php Additionally the first user contributed note on that page describes the behaviour with invalid UTF-8 strings with this modifier in detail. I use the u modifier to ensure that the given string is a valid UTF-8 string. IMO this is the most fail prove way to check for a valid UTF-8 string, as it is handled by PHP itself, instead of implementing this oneself. If you know a better way to check for a valid UTF-8 string I will change this of course. The following expression will fail: preg_match('//u', "\x80") but without the u modifier it will not.

* [ ]  Wording Group: I think what we do here is "Encoding" not "Escaping". "Escaping" means to have a certain character (say `\`) that makes the next character(s) behave differently than normal (make `\n` a linebreak). We "escape the n". Encoding on the other hand is about how data is represented. This is what we are doing here: Move data in UFT8-space to special character sequences that might contain escape characters. I would propose to rename the group to "Encode".

Done.

* [ ]  Wording HTML: Please rename the transformation to "HTMLSpecialCharsAsEntities"

Done. I hope the documentation and links I added are enough, as this is just a wrapper around a builtin PHP function.

* [ ]  Wording CSS: Please rename the transformation to "NonAlphaNumToCSSEscapes"

Is removed instead.

* [ ]  Docu CSS: Please document what [this transformation](https://github.com/ILIAS-eLearning/ILIAS/pull/6314/files#diff-a6b677866a51479f2ca08ef1f729a9de1a7fc21b25cc9018f5ef27df19d1a72fR32) is supposed to do. Maybe link some source such as [this](https://www.w3.org/International/questions/qa-escapes#cssescapes).

Is removed instead.

* [ ]  Docu HTML: Please document what [this transformation](https://github.com/ILIAS-eLearning/ILIAS/pull/6314/files#diff-92e0d450d8ec63ba77a6e87023294d0d2a6d9f061728da368f85befca262de78R32) is supposed to do. Maybe link some sources.

Done. I hope the documentation and links I added are enough, as this is just a wrapper around a builtin PHP function.

* [ ]  Docu URL: Same as for HTML.

Done. I hope the documentation and links I added are enough, as this is just a wrapper around a builtin PHP function.

* [ ]  Docu HTMLAttributes: Please document what [this transformation](https://github.com/ILIAS-eLearning/ILIAS/pull/6314/files#diff-b83583f7eb3a52d71a791c02009ccfe2a3a47bbeb67bd969b324d3105dc2997dR30) is supposed to do. Maybe link some sources. Judging on the examples in the tests, this transformation seem to do various things at once and, frankly, I cannot grasp how and where I would want to use it or what it is even supposed to do. I might ask to iterate on this further.

Done, this class is renamed to HTMLAttributeValue. This transformation only encodes a given string to be a valid HTML attribute value. This just includes all of the requirements of the HTML spec. There are just a lot of requirements. I add documentation to elaborate on all of the requirements of an HTML attribute value and added links to the corresponding HTML specs where necessary and I renamed all variables to be more fitting to the domain, so one can search for these words online.

* [ ]  Docu JavaScript: Same as for `HTMLAttributes`

After some research I noticed that this can be implemented much simpler. It uses json_encode with correct flags now. The transformation is renamed to JsonEncode.

* [ ]  Docu UTF8Chars: Please document what the methods in that trait are supposed to do.

Done, this trait is also renamed to ProvideUTF8CodepointRange.

Best regards @lscharmer

lscharmer avatar Mar 19 '24 10:03 lscharmer

@lscharmer Thanks for this PR. This is definitely needed. I would strongly suggest to extend the readme file https://github.com/ILIAS-eLearning/ILIAS/blob/trunk/components/ILIAS/Refinery/README.md with this PR, too, since this is what most devs will read. Outlining the main use cases would be very helpful.

alex40724 avatar Apr 17 '24 13:04 alex40724

@lscharmer Thanks for this PR. This is definitely needed. I would strongly suggest to extend the readme file https://github.com/ILIAS-eLearning/ILIAS/blob/trunk/components/ILIAS/Refinery/README.md with this PR, too, since this is what most devs will read. Outlining the main use cases would be very helpful.

Hi @alex40724, the README.md is adjusted accordingly.

I also renamed the JsonEncode transformation to Json as encode is already in the group / path name ($refinery->enode()->json()).

lscharmer avatar Apr 19 '24 08:04 lscharmer

I added this to the JF agenda for educational purposes :-).

mjansenDatabay avatar May 23 '24 11:05 mjansenDatabay

@lscharmer Thanks!

alex40724 avatar May 27 '24 11:05 alex40724

Jour Fixe, 27 MAY 2024: @mjansenDatabay notified us about the transformations for escaping html/css/js and asks all developers to use this methods for ILIAS 10. PR will be merged to trunk.

matthiaskunkel avatar May 27 '24 11:05 matthiaskunkel