wordpress-develop
HTML API: Reliably parse HTML in `wp_html_split()`.
Trac ticket: Core-63694
Replaces #6651
See: (#9270), #9850, #9851
Status
- [ ] This needs a new ticket for the 7.0 release.
- [ ] Some of the unit tests can and should be updated separately.
- [ ] Figure out why the test case is failing and fix it.
Design feedback
- Core has previously considered HTML like `<[[gallery]]>` to be an escaped shortcode inside an HTML tag, but HTML considers it plaintext instead of a tag (because the first character after the initial `<` is not a letter).
  - To match this behavior we can special-case text nodes which look like tags, but should we? This comes up in shortcode processing, which decides not to replace shortcodes inside tags. So the ultimate question is: a. Is this actually a shortcode inside a tag, to be ignored? b. Is this a shortcode inside a text node?
  - HTML provides the second answer (b). WordPress’ answer is contextual.
    - If it were `<[gallery]>` and the `[gallery]` shortcode translated into a tag name, then this entire thing would become a tag on replacement.
    - If it translated into a non-tag-name, however, the replacement would remain plaintext.
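
To make the two cases above concrete, here is a rough sketch in Python (chosen only so it runs on its own; it is not the PHP in this PR and does not use the HTML API). `starts_tag` and `expand_shortcodes` are hypothetical helpers invented for this illustration. The check models only the rule cited above, that a `<` opens a start tag when the next character is an ASCII letter; closing tags, comments, and other markup are deliberately left out.

```python
import string

ASCII_LETTERS = set(string.ascii_letters)


def starts_tag(html: str, at: int) -> bool:
    """Return True if the `<` at index `at` is followed by an ASCII letter.

    Hypothetical helper: it models only start tags. Closing tags,
    comments, and doctypes are ignored in this sketch.
    """
    if at >= len(html) or html[at] != "<":
        return False
    following = at + 1
    return following < len(html) and html[following] in ASCII_LETTERS


def expand_shortcodes(html: str, shortcodes: dict) -> str:
    """Naively replace each `[name]` with its output (illustration only)."""
    for name, output in shortcodes.items():
        html = html.replace(f"[{name}]", output)
    return html


escaped = "<[[gallery]]>"
single = "<[gallery]>"

# Before replacement, neither input starts a tag: `[` is not a letter.
escaped_is_tag = starts_tag(escaped, 0)
single_is_tag = starts_tag(single, 0)

# Assumed shortcode outputs: one is a valid tag name, the other is not.
as_tag_name = expand_shortcodes(single, {"gallery": "div"})
as_non_tag_name = expand_shortcodes(single, {"gallery": "123"})

print(escaped_is_tag, single_is_tag)
print(as_tag_name, starts_tag(as_tag_name, 0))
print(as_non_tag_name, starts_tag(as_non_tag_name, 0))
```

Under these assumptions the same input is plaintext before replacement and may or may not be a tag afterwards, which is why the answer is contextual for WordPress while HTML always gives answer (b).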
Implementation
This probably improves performance, in both CPU time and memory use, compared to the old PCRE-based approach.