wordpress-develop
Formatting: Strip invalid XML characters in `esc_xml()`
Trac ticket: https://core.trac.wordpress.org/ticket/19998
This PR enhances the `esc_xml()` function to strip control characters that are not valid under the XML 1.0 specification. This prevents feed parsers from breaking when they encounter invalid characters such as vertical tabs, null bytes, and other unprintable control characters in user-supplied content.
- Modified `esc_xml()` in `src/wp-includes/formatting.php` to strip invalid XML characters using a regex pattern that matches the XML 1.0 spec
- Character stripping only applies when `blog_charset` is UTF-8 to avoid encoding issues
- Preserves valid control characters (tab `\x09`, line feed `\x0A`, carriage return `\x0D`)
- Removes invalid characters (null bytes, vertical tabs, file separators, and other unprintable characters)
- Added comprehensive unit tests covering various scenarios
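
To make the character rules above concrete, here is a minimal Python sketch of the same idea. It is not the PHP code from this PR: the function name `strip_invalid_xml_chars` and its `charset` parameter are hypothetical, and the pattern is simply the complement of the `Char` production in the XML 1.0 specification, which may differ in detail from the regex the PR actually uses.

```python
import re

# XML 1.0 `Char` production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# The negated class below matches every code point outside that set.
INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str, charset: str = "UTF-8") -> str:
    """Remove characters that XML 1.0 does not allow.

    `charset` is a hypothetical stand-in for the site's `blog_charset`
    setting: stripping is skipped unless it names UTF-8.
    """
    if charset.replace("-", "").lower() != "utf8":
        return text
    return INVALID_XML_CHARS.sub("", text)
```

For example, `strip_invalid_xml_chars("Title\x0b with\x00 junk")` returns `"Title with junk"`, while tabs, line feeds, and carriage returns pass through unchanged.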