PHP-CSS-Parser icon indicating copy to clipboard operation
PHP-CSS-Parser copied to clipboard

Add support for nested selectors

Open ocoste opened this issue 8 months ago • 14 comments

When I parse a CSS with nested selector, the rendered CSS bugs

Example :

$css = "
.presentations_frameEditorItem {
    position: relative;
    a {
        border: 1px solid var(--theme-primary-border);
    }

    &.presentations_isDragging {
        opacity: 0
    }
}
";

$parser = (new Parser($css))->parse();

$css = $parser->render();

.presentations_frameEditorItem {
position: relative;
border: 1px solid var(--theme-primary-border);
}

How can we prevent this?

Thanks

ocoste avatar Mar 15 '25 13:03 ocoste

Thanks for reporting this!

This seems to be a missing feature. I've tagged and renamed this issue accordingly.

oliverklee avatar Mar 15 '25 14:03 oliverklee

This shouldn't be too tricky to implement. These are my initial thoughts:

  • RuleSet::$rules elements would need to be allowed to be either a Rule or a DeclarationBlock:
    • A (possibly-empty) interface RuleOrDeclarationBlock (for want of a better name) can be introduced for type-strictness;
    • This may allow creation of invalid constructs, such as a declaration block inside an at-rule where it is not permitted, but it would be caveat emptor - invalid constructs would be rejected at the parsing phase, and only possible via manipulation;
  • RuleSet::parseRuleSet() would need an extra Boolean parameter to permit declaration blocks as well as property-value rules:
    • Only DeclarationBlock::parse() would pass this as true;
    • Determining whether the next item is a rule or a nested declaration block is slightly tricky:
      • The next characters of interest that would determine which it is are { (for a declaration block) and ; or } for a rule;
      • But these could be contained within a string in either situation (e.g. content property for a rule, or attribute selector), so proper parsing is required;
      • We could attempt to initially parse as a rule, but a:hover would pass unless we then checked for a { coming later;
      • Or we could attempt to initially parse as a declaration block, which should always fail if there's no {;
      • I think the latter seems more reliable;
      • Either way, I think it would involve catching an exception during normal parsing operation, which I'm not keen on, because exceptions should only be used for exceptional situations, but I can't think of a way of easily avoiding that.
  • Rendering should take care of itself and need no changes.

@oliverklee, @sabberworm, WDYT?

JakeQZ avatar Mar 16 '25 01:03 JakeQZ

Though RuleSet shouldn't need to know about DeclarationBlock, as the latter is a subclass (and if it does, there's a circular dependecy). Instead it should have methods that DeclarationBlock can override.

JakeQZ avatar Mar 16 '25 02:03 JakeQZ

  • A (possibly-empty) interface RuleOrDeclarationBlock (for want of a better name) can be introduced for type-strictness

RuleSetItem?

JakeQZ avatar Mar 16 '25 02:03 JakeQZ

Ref: https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_nesting

JakeQZ avatar Mar 16 '25 02:03 JakeQZ

RuleSetItem?

I like that. (And it should definitely be an interface, not a class.)

RuleSet::parseRuleSet() would need an extra Boolean parameter to permit declaration blocks as well as property-value rules:

I consider boolean parameters that change the general way a method works an anti-pattern. Instead, I propose we add a second, well-named method.

oliverklee avatar Mar 16 '25 09:03 oliverklee

RuleSet::parseRuleSet() would need an extra Boolean parameter to permit declaration blocks as well as property-value rules:

I consider boolean parameters that change the general way a method works an anti-pattern. Instead, I propose we add a second, well-named method.

I think RuleSet::parseRuleSet() would need to call an overrideable method parseRuleSetItem, which DeclarationBlock can override to parse a nested declaration block if that's what's next, whilst the base RuleSet only parses Rules. There's a potential issue with these methods being static, and I'd prefer to avoid late-static binding if possible. Will see what the situation is when I look into implementation. But this approach should avoid the anti-pattern you mention, as well as a the circular dependency I mentioned.

JakeQZ avatar Mar 17 '25 00:03 JakeQZ

This is the code smell/antipattern I was referring to: https://luzkan.github.io/smells/flag-argument

oliverklee avatar Mar 17 '25 10:03 oliverklee

  • A (possibly-empty) interface RuleOrDeclarationBlock (for want of a better name) can be introduced for type-strictness

RuleSetItem?

Actually, this isn't going to work like that (at all easily). RuleSet uses the Rules' property name as a key, and various methods use this to identify them. DeclarationBlocks don't have a property name.

Instead I think DeclarationBlock will need a separate list of DeclarationBlock children. When it comes to rendering, it won't matter if, say, all the Rules are output before all the DeclarationBlock children, because (I think) the children can't match any of the same elements (but will need to think a bit more to be sure).

JakeQZ avatar Mar 17 '25 19:03 JakeQZ

When it comes to rendering, it won't matter if, say, all the Rules are output before all the DeclarationBlock children, because (I think) the children can't match any of the same elements (but will need to think a bit more to be sure).

Actually the order can matter, in this contrived example using the universal selector:

body,
body * {
  * {
    color: red;
  }
  color: green;
}

Child elements of body will be coloured green. But if the nested declaration block comes after the rule, they will be red, because the universal selector does not affect specificity. Note that if * is changed to, say, p, the nested rule will always take precedence (if matched), since the outer specificity is taken from the most-specific selector, whether or not it is actually matched.

This is further complicated by the fact that the nested items don't have to be simple declaration blocks. Some at-rules are allowed.

One approach might be to change DeclarationBlock so that it extends CSSBlockList rather than RuleSet, has a RuleSet as a property instead, and for backward-compatibility provides the RuleSet methods chaining on to the inner RuleSet property. That would deal with the latter problem (perhaps along with discarding at-rules which are not allowed to be nested), and also retain satisfaction for the CSSBlockList::getAllDeclarationBlocks method description: "Gets all DeclarationBlock objects recursively, no matter how deeply nested the selectors are." There's still the ordering problem to deal with. But if everything has a line number, the whole jumbled mess can be put through a sort function at render() time. I'm thinking this is the best way forward, but will sleep on it...

JakeQZ avatar Mar 17 '25 23:03 JakeQZ

Maybe we might put this on the back-burner and tackle this after a rewrite.

oliverklee avatar Mar 18 '25 08:03 oliverklee

Maybe we might put this on the back-burner and tackle this after a rewrite.

I'd prefer to get this in sooner, with more minimal refactoring, since it's already part of the spec, then tackle #1189 for 10.0.

I've added #1194 for the first part, so you can see what's invovled. This involves some very-slightly breaking changes which you can see from the changes to the tests, so it perhaps can't be backported (though users should perhaps first check that the RuleSets they get are actually instances of DeclarationBlock without assuming so, so maybe it can).

The slight downside of using delegation is that DeclarationBlock will need adapting to use Rule::parse() itself, though should be able to reuse CSSList::parseListItem() for the nested CSS.

Regarding the ordering issue, I think we can live with the edge case with the universal selector. It's the only one I can think of and not likely to occur in the real world. Reason being, is that after manipulation, the RuleSet may be extended to have line numbers higher than those of the nested items that may still be expected to be rendered after the rules. So I think we just render the rules first, then the nested items. This would provide a feature which is 99% complete with one known minor issue, which can be documented. I think that's better than not having the feature at all.

JakeQZ avatar Mar 19 '25 01:03 JakeQZ

  * Determining whether the next item is a rule or a nested declaration block is slightly tricky:
    
    * The next characters of interest that would determine which it is are `{` (for a declaration block) and `;` or `}` for a rule;
    * But these could be contained within a string in either situation (e.g. `content` property for a rule, or attribute selector), so proper parsing is required;
    * We could attempt to initially parse as a rule, but `a:hover` would pass unless we then checked for a `{` coming later;
    * Or we could attempt to initially parse as a declaration block, which should always fail if there's no `{`;
    * I think the latter seems more reliable;
    * Either way, I think it would involve catching an exception during normal parsing operation, which I'm not keen on, because exceptions should only be used for exceptional situations, but I can't think of a way of easily avoiding that.

We should look into the spec to see what we should do to distinguish properties from selectors. Rolling our own heuristic has never once worked out well.

sabberworm avatar Mar 19 '25 08:03 sabberworm

These are the specs explaining the algorithms: https://www.w3.org/TR/css-syntax-3/

oliverklee avatar Mar 19 '25 09:03 oliverklee

We should look into the spec to see what we should do to distinguish properties from selectors.

These are the specs explaining the algorithms: https://www.w3.org/TR/css-syntax-3/

I've picked out what seems to be relevant:

5.3.8. Parse a list of declarations

Note: Despite the name, this actually parses a mixed list of declarations and at-rules, as CSS 2.1 does for @page. Unexpected at-rules (which could be all of them, in a given context) are invalid and will be ignored by the consumer.

Note: This algorithm does not handle nested style rules. If your use requires that, use parse a style block’s contents.

The leads to

5.4.4. Consume a style block’s contents

Create an initially empty list of declarations decls, and an initially empty list of rules rules.

Repeatedly consume the next input token:

<whitespace-token> <semicolon-token>

  • Do nothing.

<EOF-token>

  • Extend decls with rules, then return decls.

<at-keyword-token>

<ident-token>

<delim-token> with a value of "&" (U+0026 AMPERSAND)

anything else

I think this is out-of-date, since nested style blocks no longer need to be prefaced with & (which would make things a whole lot easier). Now, an <ident-token> can also match the beginning of a selector (for any CSS property name for which there may be an HTML element with the same name, the details of which I don't think the parser should be concerned with). Attempting to parse rules like this could be done, but would need to be rewound upon encountering a {.

Also note that <EOF-token> is conceptual and seems would also include } at the end of the block.

There's also

5.3.1. Parse something according to a CSS grammar

It is often desirable to parse a string or token list to see if it matches some CSS grammar, and if it does, to destructure it according to the grammar.

This is along the lines I was originally suggesting: if it looks like a duck...

JakeQZ avatar Mar 20 '25 01:03 JakeQZ

<EOF-token>

  • Extend decls with rules, then return decls.

This doesn't seem to deal with the ordering issue either.

JakeQZ avatar Mar 20 '25 01:03 JakeQZ

Do you think it's something you can fix? Because we have a few of these cases and it breaks the rendering

ocoste avatar Apr 30 '25 11:04 ocoste

Do you think it's something you can fix? Because we have a few of these cases and it breaks the rendering

Hi @ocoste,

We will be able to implement a solution (and need to, because it's part of the spec), but it may take some time, as it is not a straightforward change. Some background work has been done. Next we have #1247 - hopefully we have agreement on the way to proceed there. This needs to be done before #1194, which will then be ready to go. The rest of the battle plan is outlined in the discussions above, but not completely decided upon.

If you have any ideas or would like to contribute in any way to help move this along, you'd be more than welcome. We're all contributing to this project in our spare time :)

JakeQZ avatar Apr 30 '25 23:04 JakeQZ