ILIAS
ILIAS copied to clipboard
`Refinery`: Transformation for escaping html/css/js
This PR introduces 5 new transformations to escape: Javascript, HTML, HTML attributes, CSS and URL's. The implementation is based upon this repository: https://github.com/laminas/laminas-escaper/
Hi @lscharmer,
I will assign this to you, so this does not appear on my agenda anymore.
Please assign back if this is ready for the next iteration.
Kind regards!
Hi @klees,
I updated this PR and addressed your points:
* [ ] CSS: Why do we need this transformation? Do we have a usecase for it?
In agreement with @mjansen, we decided that we don't and it is therefore removed from this PR :)
* [ ] HTMLAttributes: Why do we need this transformation? Do we have a usecase for it?
There are many places where PHP variables are passed to templates where they are used as HTML attributes. E.g. https://github.com/ILIAS-eLearning/ILIAS/blob/0514bac85249b86002d05915fcab7427e15d3a63/components/ILIAS/UI/src/Implementation/Component/Link/Renderer.php#L74 would be one use case where the value should be escaped as a HTML attribute.
* [ ] JavaScript: Why do we need this transformation? Do we have a usecase for it?
This is useful in places where JavaScriptBindable::withAdditionalOnLoadCode(...) is used to add JS from PHP. E.g. Here: https://github.com/ILIAS-eLearning/ILIAS/blob/0514bac85249b86002d05915fcab7427e15d3a63/components/ILIAS/UI/src/Implementation/Component/Button/Renderer.php#L94
* [ ] u: What does the `u` modifier in your regexps do? [This](https://github.com/ILIAS-eLearning/ILIAS/pull/6314/files#diff-92e0d450d8ec63ba77a6e87023294d0d2a6d9f061728da368f85befca262de78R39) seems to suggest that it somehow only matches UTF-8 strings, but I could not find according docu. Could you link some? Could you give me an example of a string that would make that expression fail?
The u
modifier is very important. The behaviour of the modifier is documented here: https://www.php.net/manual/en/reference.pcre.pattern.modifiers.php
Additionally the first user contributed note on that page describes the behaviour with invalid UTF-8 strings with this modifier in detail.
I use the u
modifier to ensure that the given string is a valid UTF-8 string. IMO this is the most fail prove way to check for a valid UTF-8 string, as it is handled by PHP itself, instead of implementing this oneself. If you know a better way to check for a valid UTF-8 string I will change this of course.
The following expression will fail: preg_match('//u', "\x80")
but without the u
modifier it will not.
* [ ] Wording Group: I think what we do here is "Encoding" not "Escaping". "Escaping" means to have a certain character (say `\`) that makes the next character(s) behave differently than normal (make `\n` a linebreak). We "escape the n". Encoding on the other hand is about how data is represented. This is what we are doing here: Move data in UFT8-space to special character sequences that might contain escape characters. I would propose to rename the group to "Encode".
Done.
* [ ] Wording HTML: Please rename the transformation to "HTMLSpecialCharsAsEntities"
Done. I hope the documentation and links I added are enough, as this is just a wrapper around a builtin PHP function.
* [ ] Wording CSS: Please rename the transformation to "NonAlphaNumToCSSEscapes"
Is removed instead.
* [ ] Docu CSS: Please document what [this transformation](https://github.com/ILIAS-eLearning/ILIAS/pull/6314/files#diff-a6b677866a51479f2ca08ef1f729a9de1a7fc21b25cc9018f5ef27df19d1a72fR32) is supposed to do. Maybe link some source such as [this](https://www.w3.org/International/questions/qa-escapes#cssescapes).
Is removed instead.
* [ ] Docu HTML: Please document what [this transformation](https://github.com/ILIAS-eLearning/ILIAS/pull/6314/files#diff-92e0d450d8ec63ba77a6e87023294d0d2a6d9f061728da368f85befca262de78R32) is supposed to do. Maybe link some sources.
Done. I hope the documentation and links I added are enough, as this is just a wrapper around a builtin PHP function.
* [ ] Docu URL: Same as for HTML.
Done. I hope the documentation and links I added are enough, as this is just a wrapper around a builtin PHP function.
* [ ] Docu HTMLAttributes: Please document what [this transformation](https://github.com/ILIAS-eLearning/ILIAS/pull/6314/files#diff-b83583f7eb3a52d71a791c02009ccfe2a3a47bbeb67bd969b324d3105dc2997dR30) is supposed to do. Maybe link some sources. Judging on the examples in the tests, this transformation seem to do various things at once and, frankly, I cannot grasp how and where I would want to use it or what it is even supposed to do. I might ask to iterate on this further.
Done, this class is renamed to HTMLAttributeValue
. This transformation only encodes a given string to be a valid HTML attribute value. This just includes all of the requirements of the HTML spec.
There are just a lot of requirements.
I add documentation to elaborate on all of the requirements of an HTML attribute value and added links to the corresponding HTML specs where necessary and I renamed all variables to be more fitting to the domain, so one can search for these words online.
* [ ] Docu JavaScript: Same as for `HTMLAttributes`
After some research I noticed that this can be implemented much simpler. It uses json_encode
with correct flags now.
The transformation is renamed to JsonEncode
.
* [ ] Docu UTF8Chars: Please document what the methods in that trait are supposed to do.
Done, this trait is also renamed to ProvideUTF8CodepointRange
.
Best regards @lscharmer
@lscharmer Thanks for this PR. This is definitely needed. I would strongly suggest to extend the readme file https://github.com/ILIAS-eLearning/ILIAS/blob/trunk/components/ILIAS/Refinery/README.md with this PR, too, since this is what most devs will read. Outlining the main use cases would be very helpful.
@lscharmer Thanks for this PR. This is definitely needed. I would strongly suggest to extend the readme file https://github.com/ILIAS-eLearning/ILIAS/blob/trunk/components/ILIAS/Refinery/README.md with this PR, too, since this is what most devs will read. Outlining the main use cases would be very helpful.
Hi @alex40724,
the README.md
is adjusted accordingly.
I also renamed the JsonEncode
transformation to Json
as encode
is already in the group / path name ($refinery->enode()->json()
).
I added this to the JF agenda for educational purposes :-).
@lscharmer Thanks!
Jour Fixe, 27 MAY 2024: @mjansenDatabay notified us about the transformations for escaping html/css/js and asks all developers to use this methods for ILIAS 10. PR will be merged to trunk.