HTML API: Introduce CSS class-list splitter.

Open dmsnell opened this issue 3 months ago • 7 comments

Trac ticket: Core-63694.

This patch extracts WP_HTML_Tag_Processor->class_list() for static calls via a new static WP_HTML_Tag_Processor::parse_class_list() method. This new method contains the internal CSS parsing code to take an HTML class attribute value and return a Generator to iterate over the classes in that value. Class names are appropriately deduplicated according to the given document compatibility mode, whose default is no-quirks mode.

Design review requests

The name isn’t great. It iterates over the class names and splits them, but it also “deduplicates” them according to the parsing rules for a class attribute in _no-quirks-mode_ HTML, and it decodes HTML entities. It also _MUST_ represent a full class attribute, because the parsing of trailing character references that are missing a semicolon or are otherwise incomplete depends on whether they fall at the end of the string.
- wp_classname_walker()
- wp_walk_class_attribute()
- wp_unique_classnames()
Should it be more useful to people wanting to conditionally add class names? Something more akin to classnames() in JS? We could pass varargs which are string|false or an array of additional class names to add.

function wp_classnames( 'wp-block-paragraph', array( 'display-wide' => $is_wide ) ) { … };
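
To make the idea concrete, here is one rough sketch of how such a helper might behave. The name `wp_classnames()` comes from the proposal above, but the signature and semantics shown here are assumptions modeled on the JavaScript `classnames()` package and are not part of this patch: strings are always included, `false` and `null` are skipped, list arrays contribute their string values, and associative arrays contribute the keys whose values are truthy.

```php
<?php
// Hypothetical sketch only; not part of the patch.
// Assumed semantics, loosely following the JS classnames() package.
function wp_classnames( ...$args ): string {
	$names = array();
	foreach ( $args as $arg ) {
		if ( is_string( $arg ) && '' !== $arg ) {
			// Plain strings are always included; array keys deduplicate.
			$names[ $arg ] = true;
		} elseif ( is_array( $arg ) ) {
			foreach ( $arg as $key => $value ) {
				if ( is_int( $key ) ) {
					// List entry: the value itself is the class name.
					if ( is_string( $value ) && '' !== $value ) {
						$names[ $value ] = true;
					}
				} elseif ( $value ) {
					// Associative entry: include the key when the value is truthy.
					$names[ $key ] = true;
				}
			}
		}
		// Anything else (false, null, ...) is skipped.
	}
	return implode( ' ', array_keys( $names ) );
}

$is_wide = true;
echo wp_classnames( 'wp-block-paragraph', array( 'display-wide' => $is_wide ) );
// Prints: wp-block-paragraph display-wide
```

This sketch only deduplicates exact string matches and does not split or decode its inputs, so it complements the class-list parsing described here rather than replacing it.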

Update: This patch has changed from introducing a new function to exposing an internal method on the Tag Processor. With this change, no new module needs to exist, and the method receives its own kind of helpful namespacing by nature of being a static method on the Tag Processor class.

Background

Many existing functions perform ad-hoc parsing of CSS class names, usually by splitting on a space character. However, there are issues with this approach:

- There is no decoding of HTML character references, which is normative inside HTML attributes.
- There is no handling of null bytes.
- Class names can be split by more than just the space character.
- There is no handling of duplicates, and while mostly benign, code forgetting to account for duplicates can lead to defects.
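
The pitfalls listed above can be illustrated with a small, self-contained sketch. This is not the Tag Processor's implementation: `naive_split()` and `careful_split()` are hypothetical helpers, character-reference decoding is only approximated with PHP's `html_entity_decode()` (which does not follow HTML's attribute-decoding rules in every edge case), and no-quirks mode (case-sensitive comparison) is assumed.

```php
<?php
// Hypothetical helper: the ad-hoc approach of splitting on a single space.
function naive_split( string $class_attr ): array {
	return explode( ' ', $class_attr );
}

// Hypothetical helper: a closer approximation of HTML class-attribute parsing.
function careful_split( string $class_attr ): array {
	// Approximate character-reference decoding; real HTML rules differ in edge cases.
	$decoded = html_entity_decode( $class_attr, ENT_QUOTES | ENT_HTML5, 'UTF-8' );
	// HTML replaces null bytes with U+FFFD REPLACEMENT CHARACTER.
	$decoded = str_replace( "\0", "\u{FFFD}", $decoded );
	// Split on HTML's ASCII whitespace: tab, line feed, form feed, carriage return, space.
	$names = preg_split( '/[\t\n\f\r ]+/', $decoded, -1, PREG_SPLIT_NO_EMPTY );
	// Deduplicate case-sensitively (no-quirks mode), keeping the first occurrence.
	return array_values( array_unique( $names ) );
}

$attr = "wide\ttext-center wide  is&#x2D;active";

// Four entries: "wide\ttext-center", "wide", "", and the undecoded "is&#x2D;active".
var_export( naive_split( $attr ) );

// Three entries: 'wide', 'text-center', 'is-active'.
var_export( careful_split( $attr ) );
```

The naive version leaves a tab inside a class name, emits an empty string for the doubled space, and misses the duplicate `wide` because the first one is still fused to `text-center`.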

The new function handles these nuances so that developers can focus on reading, adding, and removing CSS class names. It serves as a middle ground between legacy code interacting with CSS class names in isolation and code processing full HTML documents.

Sep 25 '25 18:09 dmsnell