selectolax

List of selectors throwing "Bad CSS Selectors".

Open FanaticPythoner opened this issue 2 years ago • 6 comments

Here is a list of CSS selectors which should work, but instead throw "Bad CSS Selectors". I'm using selectolax.parser.HTMLParser with no other parameter than the HTML to parse, as well as the HTMLParser.css() function.

.comment-list>li .comment-list>li ol.children .comment-list>li ol.children li .comment-list>li ol.children li:last-child .comment-list>li:not(:last-child) .comment-list>ol .figure-post>*:first-child .figure-post>*:last-child .figure-post__media>a .figure-post__media>a img .figure-post__media>a img .figure-post__content>*:first-child .figure-post__content>*:last-child [data-arts-theme-text=light] .widget_nav_menu ul.menu>li a, .arts-elementor-theme-light .widget_nav_menu ul.menu>li a .header_sticky.bg-dark-1, .header_sticky.bg-dark-2, .header_sticky.bg-dark-3, .header_sticky.bg-dark-4, .header_sticky .menu>li>a .input-float__input_focused+.input-float__label, .input-float__input_not-empty+.input-float__label .input-float__input_focused+.input-float__label .trp-language-switcher>div>a .menu>li .menu>li:not(:last-child) .menu>li a .menu .menu-item-has-children>a~ul .menu .sub-menu>li .menu .sub-menu>li a .menu .sub-menu>li a .menu-overlay>li .menu-overlay>li>a .menu-overlay .sub-menu>li .menu-overlay .sub-menu>li>a .modal-footer>:not(:first-child) .modal-footer>:not(:last-child) .post__content>*:first-child, .post__comments>*:first-child, .section-content__heading>*:first-child, .section-content__text>*:first-child .post__content>*:last-child, .post__comments>*:last-child, .section-content__heading>*:last-child, .section-content__text>*:last-child .post__content ul:not(.wp-block-gallery) li>span, .post__comments ul:not(.wp-block-gallery) li>span, .section-content__heading ul:not(.wp-block-gallery) li>span, .section-content__text ul:not(.wp-block-gallery) li>span .post__content ol:not(.comment-list) li>span, .post__comments ol:not(.comment-list) li>span, .section-content__heading ol:not(.comment-list) li>span, .section-content__text ol:not(.comment-list) li>span .post__content>ul, .comment-content>ul, .section-content__heading>ul, .section-content__text>ul .section-masthead[data-arts-os-animation]:not([data-arts-os-animation=animated])>* .section-masthead__meta-item>* 
[data-arts-theme-text=light]:not([data-arts-header-overlay-theme-text=dark]) .split-text:not(.js-split-text) .has-drop-cap>div:first-child, .arts-elementor-theme-light .split-text:not(.js-split-text) .has-drop-cap>div:first-child [data-arts-theme-text=light]:not([data-arts-header-overlay-theme-text=dark]) .input-float__input_focused+.input-float__label, .arts-elementor-theme-light .input-float__input_focused+.input-float__label .split-text:not(.js-split-text) .has-drop-cap>div:first-child .split-text:not(.js-split-text) .has-drop-cap>div:first-child:after .pt-small.offset_bottom .section-offset__content, .pt-small.offset_bottom>.elementor-container .pt-medium.offset_bottom .section-offset__content, .pt-medium.offset_bottom>.elementor-container .pt-large.offset_bottom .section-offset__content, .pt-large.offset_bottom>.elementor-container .pb-small.offset_top .section-offset__content, .pb-small.offset_top>.elementor-container .pb-medium.offset_top .section-offset__content, .pb-medium.offset_top>.elementor-container .pb-large.offset_top .section-offset__content, .pb-large.offset_top>.elementor-container .widget_nav_menu ul.menu>li .widget_nav_menu ul.menu>li a .widget_nav_menu ul.menu>li a:after, .widget_nav_menu ul.menu>li a:before .widget_nav_menu ul.menu>li a .widget_nav_menu ul.menu>li.menu-item-has-children .widget_nav_menu ul.menu>li.menu-item-has-children a:after .widget_nav_menu ul.sub-menu>li .widget_nav_menu ul.sub-menu>li>a .widget_nav_menu ul.sub-menu>li>a .widget_rss ul>li .widget_rss ul>li:last-child .widget_icl_lang_sel_widget .wpml-ls-legacy-dropdown a, .widget_icl_lang_sel_widget .wpml-ls-legacy-dropdown a, .widget_icl_lang_sel_widget .wpml-ls-legacy-dropdown .wpml-ls-current-language>a .widget_text .textwidget>p

FanaticPythoner avatar Dec 30 '22 16:12 FanaticPythoner

Please put spaces around >, as recommended in the CSS specification.

rushter avatar Dec 30 '22 18:12 rushter

@FanaticPythoner

Please see the note on whitespace in selectors. That said, in the latest lexbor version, combinators work correctly without whitespace.

lexborisov avatar Dec 30 '22 19:12 lexborisov

I understand that. I still strongly believe the library should handle these cases, since far from everyone follows best practices. Moreover, people like me, who obtain selectors by parsing the CSS files of a given web page, have no control over whether the authors of those files followed good practices.

FanaticPythoner avatar Dec 30 '22 19:12 FanaticPythoner

@rushter @lexborisov

FanaticPythoner avatar Dec 30 '22 19:12 FanaticPythoner

In the meantime, for those reading this thread, you can do something like this in your code before calling HTMLParser.css():

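import re

# selector_str holds the selector about to be passed to HTMLParser.css().
# Combinators that are rejected when written without surrounding spaces.
chars_to_patch = ['>', '+', '~']

for c in chars_to_patch:
    # Pad every occurrence, including ones that are already spaced.
    selector_str = selector_str.replace(c, ' ' + c + ' ')

# Collapse the runs of spaces that the padding may have introduced.
selector_str = re.sub(r' {2,}', ' ', selector_str).strip()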

FanaticPythoner avatar Dec 31 '22 03:12 FanaticPythoner

> I understand that. It is still, however, my strong belief that these cases should still be handled by the library, as it is far from everyone that follows best practices. Moreover, people, such as me, obtaining those selectors via parsing the CSS files in a given web page don't have control over whether or not the person who made the CSS declarations in said CSS files followed good practices.

I can't fix it on my side. Adding whitespace with an approach similar to yours is risky, so this needs to be fixed in the Modest engine. Is there anything stopping you from using the lexbor backend instead? It's an improved version of the parser, but it's not 100% compatible with Modest.
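
To illustrate the risk: a blanket string replacement also rewrites the `~` in an attribute selector such as `[class~=foo]` and any `>` or `+` that appears inside a quoted attribute value. The sketch below is one possible, more conservative workaround, not something selectolax or Modest provides. The function name `pad_combinators` is hypothetical. It pads combinators only outside brackets, parentheses, and quotes, and it assumes quoted strings contain no backslash-escaped quotes.

```python
def pad_combinators(selector: str) -> str:
    """Put single spaces around top-level '>', '+' and '~' combinators.

    Hypothetical helper: characters inside [...], (...) or quotes are
    copied unchanged, so attribute operators like '~=' and arguments
    like 'nth-child(2n+1)' are left alone.
    """
    out = []
    depth = 0            # nesting depth of [...] and (...)
    quote = None         # the open quote character while inside a string
    skip_spaces = False  # True right after a padded combinator
    for ch in selector:
        if skip_spaces and ch == " ":
            continue     # drop spaces that followed the combinator
        skip_spaces = False
        if quote:
            out.append(ch)
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
            out.append(ch)
        elif ch in "[(":
            depth += 1
            out.append(ch)
        elif ch in "])":
            depth = max(depth - 1, 0)
            out.append(ch)
        elif ch in ">+~" and depth == 0:
            # Remove spaces before the combinator, then pad it once.
            while out and out[-1] == " ":
                out.pop()
            out.append(f" {ch} ")
            skip_spaces = True
        else:
            out.append(ch)
    return "".join(out).strip()


print(pad_combinators(".menu .menu-item-has-children>a~ul"))
print(pad_combinators("[class~=foo]>li:nth-child(2n+1)"))
```

Because the helper skips everything inside parentheses, a combinator nested in something like `:not(a>b)` stays unpadded, so this is a partial workaround at best. Fixing the parser, or switching to the lexbor backend, remains the more robust option.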

rushter avatar Jan 06 '23 12:01 rushter