java-html-sanitizer
Issue with replacement in the URL of an anchor tag's href attribute during HTML sanitization
I have a URL like the one below in an HTML anchor tag.
<a href="https://xxx.com/qwert/ab_cdefmnp.php?pf=ppp_qqq&num_yyy=ZZZZZ">ZZZZ</a>
When I apply HTML sanitization, why is the value &num replaced by #? The output HTML looks like this:
<a href="https://xxx.com/qwert/ab_cdefmnp.php?pf=ppp_qqq#_yyy=ZZZZZ">ZZZZ</a>
The URL is now invalid.
I am using OWASP in my project. How can I avoid this change?
Any thoughts or suggestions would be appreciated.
Hello, I am a user of the sanitizer. Could you share your sanitizer policy?
We use OWASP with an AntiSamy policy as well. We have an antisamy.xml.
This code:
import org.owasp.html.Sanitizers;

// Sanitize the anchor tag with the prepackaged LINKS policy.
String out = Sanitizers.LINKS.sanitize(
    "<a href=\"https://xxx.com/qwert/ab_cdefmnp.php?pf=ppp_qqq&num_yyy=ZZZZZ\">ZZZZ</a>");
Produces:
<a href="https://xxx.com/qwert/ab_cdefmnp.php?pf=ppp_qqq&amp;num_yyy=ZZZZZ" rel="nofollow">ZZZZ</a>
Note that the "&num" has become "&amp;num", and this is correct. On the other hand, if the input had contained "...qqq&num;_yyy", then the additional ';' would have led to the entity being recognised as a '#', and that would also have been correct given the input.
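The difference between the two inputs can be illustrated with a small stand-alone sketch. It does not use the sanitizer library and is not how the library is implemented. The class NumEntityDemo and its methods decodeNum and escapeAmp are hypothetical names made up for this illustration: decodeNum treats only "&num;" (with the terminating semicolon) as a character reference for '#', and escapeAmp re-encodes every remaining '&' as "&amp;" on output.

```java
class NumEntityDemo {
    // Hypothetical helper: decodes only the "&num;" character reference.
    // "&num" without the semicolon is left as literal text.
    static String decodeNum(String s) {
        return s.replace("&num;", "#");
    }

    // Hypothetical helper: escapes every '&' in a decoded value for HTML output.
    static String escapeAmp(String s) {
        return s.replace("&", "&amp;");
    }

    public static void main(String[] args) {
        String noSemicolon = "pf=ppp_qqq&num_yyy=ZZZZZ";
        String withSemicolon = "pf=ppp_qqq&num;_yyy=ZZZZZ";

        // prints pf=ppp_qqq&amp;num_yyy=ZZZZZ
        System.out.println(escapeAmp(decodeNum(noSemicolon)));
        // prints pf=ppp_qqq#_yyy=ZZZZZ
        System.out.println(escapeAmp(decodeNum(withSemicolon)));
    }
}
```

If the reported output contains '#', it may be worth checking whether the HTML reaching the sanitizer already contains "&num;" or a literal '#', for example after an earlier processing step.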
Please provide a minimal reproducible example of the code you believe is producing incorrect output.