AntiSamy fails to parse CSS with modern @media syntax, error: org.w3c.css.sac.CSSParseException: Identifier expected.

Open ashishkataria86 opened this issue 1 month ago • 7 comments

When sanitizing HTML that includes a <style> element with @media rules, AntiSamy throws a CSSParseException.

Example input:

String dirtyInput = "<style>@media handheld, only screen and (max-device-width: 480px){div, a, p, td, th, li, dt, dd { -webkit-text-size-adjust: auto; }} </style>";
AntiSamy as = new AntiSamy();
CleanResults cr = as.scan(dirtyInput);

Observed exception:

org.w3c.css.sac.CSSParseException: Identifier expected.
    at org.apache.batik.css.parser.Parser.createCSSParseException (Parser.java:1687)
    at org.apache.batik.css.parser.Parser.createCSSParseException (Parser.java:1676)
    at org.owasp.validator.css.CssParser.parseMediaType (CssParser.java:175)
    at org.owasp.validator.css.CssParser.parseMediaQuery (CssParser.java:151)
    at org.owasp.validator.css.CssParser.parseMediaList (CssParser.java:113)
    at org.owasp.validator.css.CssParser.parseMediaRule (CssParser.java:227)
    at org.apache.batik.css.parser.Parser.parseStyleSheet (Parser.java:220)
    at org.owasp.validator.css.CssScanner.scanStyleSheet (CssScanner.java:170)
    at org.owasp.validator.html.scan.AntiSamyDOMScanner.processStyleTag (AntiSamyDOMScanner.java:465)
    at org.owasp.validator.html.scan.AntiSamyDOMScanner.actionValidate (AntiSamyDOMScanner.java:396)
    at org.owasp.validator.html.scan.AntiSamyDOMScanner.recursiveValidateTag (AntiSamyDOMScanner.java:298)
    at org.owasp.validator.html.scan.AntiSamyDOMScanner.processChildren (AntiSamyDOMScanner.java:712)
    at org.owasp.validator.html.scan.AntiSamyDOMScanner.processChildren (AntiSamyDOMScanner.java:704)
    at org.owasp.validator.html.scan.AntiSamyDOMScanner.scan (AntiSamyDOMScanner.java:177)
    at org.owasp.validator.html.AntiSamy.scan (AntiSamy.java:127)
    at org.owasp.validator.html.AntiSamy.scan (AntiSamy.java:105)
    at com.example.App.main (App.java:12)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:279)
    at java.lang.Thread.run (Thread.java:1474)

ashishkataria86 avatar Nov 01 '25 09:11 ashishkataria86

@spassarop Another issue for you to research.

davewichers avatar Nov 01 '25 15:11 davewichers

@media support was introduced when fixing issue #552. However, it was decided not to leave it "enabled" in the default policy. Batik-CSS should have implemented it, but the workaround ended up being implemented partly in AntiSamy's code and partly in the antisamy-anythinggoes.xml policy.

In that XML, look for the CSS properties that start with _media and copy all of them into your policy file. In particular, handheld is in your test case but not in the example policy, so you should add it as a valid literal within the _mediatype property. The same applies to -webkit-....

Anyway, AntiSamy 1.7.8 has @media implemented and raises no exceptions even if media properties are missing from the policy. You are probably using an earlier version.

spassarop avatar Nov 02 '25 23:11 spassarop

Hi @spassarop, I copied all _media CSS properties from antisamy-anythinggoes.xml to my policy file. The issue is not related to the handheld media type, but to comma-separated media queries.

@media tv, screen and (max-width: 600px) {
  body { font-size: 14px; }
}
@media screen and (max-width: 600px), print and (min-resolution: 300dpi) {
  body { color: red; }
}

Observed exception: AntiSamy fails to parse any of the comma-separated media queries and throws the following exception:

org.w3c.css.sac.CSSParseException: Identifier expected.
    at org.apache.batik.css.parser.Parser.createCSSParseException (Parser.java:1687)
    at org.apache.batik.css.parser.Parser.createCSSParseException (Parser.java:1676)
    at org.owasp.validator.css.CssParser.parseMediaType (CssParser.java:175)
    at org.owasp.validator.css.CssParser.parseMediaQuery (CssParser.java:151)
    at org.owasp.validator.css.CssParser.parseMediaList (CssParser.java:113)
    at org.owasp.validator.css.CssParser.parseMediaRule (CssParser.java:227)
    at org.apache.batik.css.parser.Parser.parseStyleSheet (Parser.java:220)
    at org.owasp.validator.css.CssScanner.scanStyleSheet (CssScanner.java:170)
    at org.owasp.validator.html.scan.AntiSamyDOMScanner.processStyleTag (AntiSamyDOMScanner.java:465)
    at org.owasp.validator.html.scan.AntiSamyDOMScanner.actionValidate (AntiSamyDOMScanner.java:396)
    at org.owasp.validator.html.scan.AntiSamyDOMScanner.recursiveValidateTag (AntiSamyDOMScanner.java:298)
    at org.owasp.validator.html.scan.AntiSamyDOMScanner.processChildren (AntiSamyDOMScanner.java:712)
    at org.owasp.validator.html.scan.AntiSamyDOMScanner.processChildren (AntiSamyDOMScanner.java:704)
    at org.owasp.validator.html.scan.AntiSamyDOMScanner.scan (AntiSamyDOMScanner.java:177)
    at org.owasp.validator.html.AntiSamy.scan (AntiSamy.java:127)
    at org.owasp.validator.html.AntiSamy.scan (AntiSamy.java:105)
    at com.example.App.main (App.java:12)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:279)
    at java.lang.Thread.run (Thread.java:1474)
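
For reference, the structure involved here is the media query list: queries are separated by top-level commas, and each query may start with only or not followed by a media type. Below is a minimal Java sketch of that structure. It is not AntiSamy's CssParser or Batik code, and the class and method names (MediaQueryListSketch, splitMediaQueries, mediaType) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical helper names, not part of AntiSamy or Batik.
public class MediaQueryListSketch {

    /** Splits an @media prelude into queries at commas that are outside parentheses. */
    public static List<String> splitMediaQueries(String prelude) {
        List<String> queries = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0; // current parenthesis nesting depth
        for (char c : prelude.toCharArray()) {
            if (c == '(') {
                depth++;
            } else if (c == ')' && depth > 0) {
                depth--;
            }
            if (c == ',' && depth == 0) {
                // A top-level comma ends the current media query.
                queries.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        queries.add(current.toString().trim());
        return queries;
    }

    /**
     * Returns the media type of a single query, skipping a leading "only" or "not".
     * Returns null when the query is empty or starts with a feature expression.
     */
    public static String mediaType(String query) {
        String[] tokens = query.trim().split("\\s+");
        int i = 0;
        if (tokens[i].equalsIgnoreCase("only") || tokens[i].equalsIgnoreCase("not")) {
            i++;
        }
        if (i >= tokens.length || tokens[i].isEmpty() || tokens[i].startsWith("(")) {
            return null;
        }
        return tokens[i];
    }

    public static void main(String[] args) {
        // Prelude taken from the <style> block in the original report.
        String prelude = "handheld, only screen and (max-device-width: 480px)";
        for (String query : splitMediaQueries(prelude)) {
            System.out.println(query + " -> " + mediaType(query));
        }
    }
}
```

With the prelude from the original report, this yields two queries with media types handheld and screen, so whatever parses the list has to accept a fresh media type (optionally preceded by only or not) after each comma.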

ashishkataria86 avatar Nov 03 '25 12:11 ashishkataria86

I have created a PR with a possible quick fix: https://github.com/nahsra/antisamy/pull/641 Please feel free to comment/review.

ashishkataria86 avatar Nov 06 '25 13:11 ashishkataria86

I still cannot reproduce the issue. I am using the v1.7.8 tag with the media policy fragments added, and everything works fine. This is my test:

@Test
public void _testGithubIssue() throws ScanException, PolicyException {
  String input = "@media tv, screen and (max-width: 600px) {\n" +
          "  body { font-size: 14px; }\n" +
          "}";
  System.out.println(as.scan(input, policy, AntiSamy.DOM).getCleanHTML());
  System.out.println(as.scan(input, policy, AntiSamy.SAX).getCleanHTML());
  input = "@media screen and (max-width: 600px), print and (min-resolution: 300dpi) {\n" +
          "  body { color: red; }\n" +
          "}";
  System.out.println(as.scan(input, policy, AntiSamy.DOM).getCleanHTML());
  System.out.println(as.scan(input, policy, AntiSamy.SAX).getCleanHTML());
}

The output is:

@media tv, screen and (max-width: 600px) {   body { font-size: 14px; } }
@media tv, screen and (max-width: 600px) {   body { font-size: 14px; } }
@media screen and (max-width: 600px), print and (min-resolution: 300dpi) {   body { color: red; } }
@media screen and (max-width: 600px), print and (min-resolution: 300dpi) {   body { color: red; } }

The content is the same and I had no errors.

spassarop avatar Nov 22 '25 19:11 spassarop

@spassarop Your test passes because you are calling as.scan(input, policy, AntiSamy.DOM) with a string that contains only the @media rule. However, the failure happens when the exact same media query is part of a <style> element.

Here is an input that reproduces the issue:

<style>
@media tv, screen and (max-width: 600px) {
  body { font-size: 14px; }
}
</style>

So the behavior difference seems to stem from how AntiSamyDOMScanner.processStyleTag() delegates to the Batik parser, rather than from standalone CSS parsing.

ashishkataria86 avatar Nov 26 '25 08:11 ashishkataria86

That was dumb on my part. You’re right, I wrote the tests as if I were calling the CSS parser directly. That’s one of the reasons why we ask for test code.

I cannot try it right now, but if the PR comes with written tests that support the changes it will be alright.

spassarop avatar Nov 26 '25 11:11 spassarop