enhance simpletoc_sanitize_string function

Open bwmatter opened this issue 7 months ago • 1 comment

Added extra validation to the simpletoc_sanitize_string() function to ensure the generated ID attribute is valid (i.e., begins with a letter).

bwmatter, May 31 '25 19:05

The intention is good (ensure IDs don’t start with a digit), but the current patch has a few bugs and will over-prefix/over-encode. I’d request changes before merging.

Issues in the PR

Regex is wrong for “begins with a letter”. "/^[_a-zA-Z]+$/" is anchored at both ends, so it has to match the entire string and rejects any slug that contains a digit or a dash. You only want to check the first character.

Double encoding. $urlencoded = urlencode($sanitized_string); is executed in both branches and again after the if. The repeated calls are redundant, and the duplication makes it easy for a later change to encode the value twice.

Encoding policy. Encoding here is questionable: sanitize_title_with_dashes() already yields URL/ID-safe slugs. Returning an encoded value can lead to later double encoding when appended to URLs.

Empty slug case. If a heading becomes empty after sanitization (e.g., only emojis), there’s no fallback.
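
A minimal sketch of how the four points above could be addressed together. This is not the plugin's actual code: the function name example_ensure_valid_id, the 'toc-' prefix, and the 'section' fallback are hypothetical, and the input is assumed to be a slug that sanitize_title_with_dashes() has already produced.

```php
<?php
/**
 * Hypothetical helper, not part of SimpleTOC: turn an already-sanitized
 * slug into a string that can be used as an HTML ID.
 *
 * $prefix and $fallback are illustrative defaults, not plugin settings.
 */
function example_ensure_valid_id(string $slug, string $prefix = 'toc-', string $fallback = 'section'): string
{
    // Empty slug case: the heading sanitized down to nothing (e.g. only emojis).
    if ($slug === '') {
        return $fallback;
    }

    // Check only the first character; digits and dashes later in the slug are fine.
    if (preg_match('/^[A-Za-z]/', $slug) !== 1) {
        return $prefix . $slug;
    }

    // No urlencode() here: encoding is left to the code that builds the URL.
    return $slug;
}

echo example_ensure_valid_id('getting-started'), "\n"; // getting-started
echo example_ensure_valid_id('2024-roadmap'), "\n";    // toc-2024-roadmap
echo example_ensure_valid_id(''), "\n";                // section
```

The sketch prefixes only when the first character is not a letter, so slugs that are already valid pass through unchanged. It does not make the fallback unique; if several headings sanitize to an empty string, de-duplication would still have to happen elsewhere.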

mtoensing, Aug 16 '25 22:08