owasp-java-encoder

Combining OWASP Sanitizer and Encoder

Open bmscodespace opened this issue 1 year ago • 4 comments

Hi,

Is it possible to combine the OWASP Sanitizer and the OWASP Encoder so that, instead of removing malicious code, the problematic parts of a given string are encoded? That way, for example, a script tag would do no harm and would just be displayed as text. I am asking because I would like to handle texts where it is not certain whether they will be displayed as inner HTML or as "normal text".

Thank you very much for any answer ;)

bmscodespace avatar Jan 19 '24 10:01 bmscodespace

I think this would be a great idea. Neither library is that large, so combining them would make sense. +1

melloware avatar Jan 19 '24 13:01 melloware

If the content is data that you want to display safely, exactly as a user typed it in, then I would use the encoder.

If the content is HTML authored by a user that you actually want to render, then you want to use the HTML sanitizer.

Does that make sense to you?

-- Jim

jmanico avatar Jan 19 '24 13:01 jmanico

@jmanico it totally does. I just find that a lot of people use both of these libs: one for sanitizing HTML input and the other for sanitizing output before it's sent back to the browser, like JSON data etc. I know in PrimeFaces we use both libraries.

melloware avatar Jan 19 '24 14:01 melloware

Hi,

Thank you for your comments. My question imagined a scenario where we don't know whether a text will be displayed as inner HTML, for example as formatted text with lots of p or b tags in it, or as ordinary text data that a user typed in. If I sanitize the text, this might destroy a text such as:

A script in HTML starts with <script> and ends with </script> .

On the other hand, if I encode every string, an HTML string that we might want to display as formatted text will instead be displayed as raw HTML source, with possible code from an attacker in it ;).

bmscodespace avatar Jan 24 '24 17:01 bmscodespace
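
For illustration, here is a minimal sketch of what encoding does to the sample sentence above. The `escapeHtml` helper is a hypothetical, hand-written stand-in so the snippet needs no dependencies. In a real project the OWASP Java Encoder's `Encode.forHtml(String)` would take its place, and its exact output may differ in detail.

```java
class EncodeAsTextSketch {

    // Hypothetical stand-in for a real HTML encoder; it replaces the five
    // characters that are significant in HTML text and attribute values.
    public static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;"); break;
                case '<':  out.append("&lt;");  break;
                case '>':  out.append("&gt;");  break;
                case '"':  out.append("&#34;"); break;
                case '\'': out.append("&#39;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "A script in HTML starts with <script> and ends with </script> .";
        // Prints: A script in HTML starts with &lt;script&gt; and ends with &lt;/script&gt; .
        System.out.println(escapeHtml(text));
    }
}
```

A browser renders the encoded string as the original sentence, tags included, without executing anything. The same call applied to text that was meant as formatted HTML would show its markup literally, which is the trade-off described in the comment above.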

Encoding must be done at the point of output. Otherwise you run into the problem of using the wrong encoding.

jeremylong avatar Jul 26 '24 11:07 jeremylong
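
A rough sketch of why the point of output matters: the same stored value needs a different encoding depending on where it is written. Both helpers below are hypothetical, dependency-free stand-ins and not the library's implementation. The OWASP Java Encoder offers context-specific methods such as `Encode.forHtml` and `Encode.forJavaScript` for this purpose.

```java
class OutputContextSketch {

    // Hypothetical stand-in for an HTML text encoder.
    public static String forHtmlText(String raw) {
        return raw.replace("&", "&amp;")
                  .replace("<", "&lt;")
                  .replace(">", "&gt;")
                  .replace("\"", "&#34;")
                  .replace("'", "&#39;");
    }

    // Hypothetical stand-in for a JavaScript string encoder: every character
    // other than ASCII letters, digits and spaces becomes a 4-digit Unicode escape.
    public static String forJsString(String raw) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            boolean plain = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9') || c == ' ';
            if (plain) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The value is stored exactly as typed and is only encoded when written out.
        String raw = "</script><b>it's</b>";

        // HTML body context
        System.out.println("<p>" + forHtmlText(raw) + "</p>");

        // JavaScript string literal context
        System.out.println("<script>var s = '" + forJsString(raw) + "';</script>");
    }
}
```

If the value had been HTML-encoded once at input time, the script context would receive entity text such as `&#39;` instead of a correctly escaped quote. That is the wrong-encoding problem: only the code that writes the output knows which context applies.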