
[feature-request] `:not()` to support generic selectors (not only "simple" ones)

Open starrify opened this issue 9 years ago • 2 comments

The document (version 0.9.1) says:

:not() accepts a sequence of simple selectors, not just single simple selector. For example, :not(a.important[rel]) is allowed, even though the negation contains 3 simple selectors.

May I ask what is a simple selector? Can :not() support something like :not(a>b)?

>>> import cssselect
>>> cssselect.parse('a:not(p>a)')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.4/site-packages/cssselect/parser.py", line 355, in parse
    return list(parse_selector_group(stream))
  File "/usr/lib/python3.4/site-packages/cssselect/parser.py", line 370, in parse_selector_group
    yield Selector(*parse_selector(stream))
  File "/usr/lib/python3.4/site-packages/cssselect/parser.py", line 378, in parse_selector
    result, pseudo_element = parse_simple_selector(stream)
  File "/usr/lib/python3.4/site-packages/cssselect/parser.py", line 471, in parse_simple_selector
    raise SelectorSyntaxError("Expected ')', got %s" % (next,))
cssselect.parser.SelectorSyntaxError: Expected ')', got <DELIM '>' at 7>

starrify avatar Aug 24 '15 00:08 starrify

A simple selector is defined at https://drafts.csswg.org/selectors-3/#simple-selectors-dfn . The definition does not include a > b.

This restriction of :not() is as defined in the Selectors Level 3 specification. This matches what most browsers implement in CSS today.
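To make the distinction concrete, here is a rough sketch of a check for whether a :not() argument contains a combinator, which is what separates a sequence of simple selectors from a complex selector such as a > b. The function name `has_top_level_combinator` is made up for illustration and is not part of cssselect. It assumes a single selector (no commas) and ignores backslash escapes, so treat it as an approximation rather than a real tokenizer.

```python
def has_top_level_combinator(selector):
    """Return True if `selector` has a combinator ('>', '+', '~' or
    whitespace) outside of (...), [...] and quoted strings.

    Hypothetical helper for illustration only. Assumes one selector,
    no commas, no backslash escapes.
    """
    depth = 0     # nesting level inside (...) or [...]
    quote = None  # the quote character currently open, if any
    for ch in selector.strip():
        if quote:
            # Inside a string: only the matching quote ends it.
            if ch == quote:
                quote = None
            continue
        if ch in "\"'":
            quote = ch
        elif ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
        elif depth == 0 and (ch in ">+~" or ch.isspace()):
            return True
    return False


print(has_top_level_combinator("a.important[rel]"))  # False
print(has_top_level_combinator("p>a"))               # True
print(has_top_level_combinator("a b"))               # True
```

Note that `~` inside an attribute selector such as [rel~=x] and `+` inside :nth-child(2n+1) are skipped because they sit inside brackets or parentheses.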

Selectors Level 4 lifts that restriction so it would be fine to implement in cssselect. However, I don’t know how translating this into XPath would work. If you figure it out, I’d review a pull request.

SimonSapin avatar Aug 24 '15 09:08 SimonSapin

As :not() takes a selector list, and not just two simple selectors joined by a combinator, general support seems to need some way of inverting a selection, which XPath 1.0 doesn't have. So it looks like the only way to support a given expression is to work out its XPath translation by hand and then implement that translation. #124 should therefore add support for things like :not(a > b), mentioned in this issue, but more complicated cases (e.g. :not(a > b > c, d > e)) will need additional code specific to each of them.

It's also not very intuitive, at least for me, because you also need to invert the order: while a > b translates to //a/b, :not(a > b) translates to something like //b[parent::*[name() != 'a']]. I'm not sure how complicated this gets for longer selectors or for selector lists inside :not(), and I don't know which complex use cases are useful enough to implement. So for the scope of #124 I think we need something like this:

  • :not(a > b) - select anything that is not (b with a parent a) ≡ select anything that is not b or doesn't have a parent a, so //*[not(b and parent::*[a])]
  • :not(a b) - select anything that is not (b with an ancestor a) ≡ select anything that is not b or doesn't have an ancestor a, so //*[not(b and ancestor::*[a])]
  • :not(a + b) - select anything that is not (b immediately preceded by a sibling a) ≡ select anything that is not b or isn't immediately preceded by a sibling a, so //*[not(b and preceding-sibling::*[position()=1 and a])]
  • :not(a ~ b) - select anything that is not (b with a preceding sibling a) ≡ select anything that is not b or doesn't have a preceding sibling a, so //*[not(b and preceding-sibling::*[a])]

where a and b are whatever we already support as conditions, I think. Disclaimer: I may be wrong with these expressions.
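The four cases can be sketched as a small string builder. This is only an illustration, not how cssselect or #124 implements it: `negated_combinator_xpath` and `_LEFT_AXIS` are hypothetical names, and `left` and `right` are assumed to be XPath 1.0 conditions that already exist (for example "self::a"). The sketch treats the right-hand side as the element being tested and looks backwards for the left-hand side (parent, ancestor, or preceding sibling).

```python
# Hypothetical helper, not part of cssselect.
# Maps a CSS combinator to the XPath step that finds the left-hand
# element starting from the right-hand element.
_LEFT_AXIS = {
    ">": "parent::*[{left}]",
    " ": "ancestor::*[{left}]",
    "+": "preceding-sibling::*[1][{left}]",  # nearest preceding sibling
    "~": "preceding-sibling::*[{left}]",
}


def negated_combinator_xpath(left, combinator, right):
    """Build an XPath 1.0 expression for :not(<left> <combinator> <right>).

    `left` and `right` are XPath conditions such as "self::a" (an
    assumption of this sketch). The result selects every element that
    does not satisfy `right` together with the required relative.
    """
    try:
        template = _LEFT_AXIS[combinator]
    except KeyError:
        raise ValueError("unsupported combinator: %r" % (combinator,))
    related = template.format(left=left)
    return "//*[not(({right}) and {related})]".format(
        right=right, related=related
    )


for comb in (">", " ", "+", "~"):
    print(repr(comb), negated_combinator_xpath("self::a", comb, "self::b"))
```

Selector lists such as :not(a > b, d > e) are out of scope here; one possible extension would be to join several such conditions with `or` inside a single not(), but that is untested.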

wRAR avatar Aug 04 '21 17:08 wRAR