textwrap: don't split words at punctuation by default
The default `textwrap::WordSeparator::UnicodeBreakProperties` provides sensible line breaking for (e.g.) emojis and CJK text. Unfortunately, it also considers punctuation like `/` to be an appropriate location for line breaks. This is fine for normal text, but leads to very bad behavior when attempting to wrap error messages. Here, a file path is broken across multiple lines (with box drawing characters added in between parts of the path as well), making it impossible to copy-paste the path out of the error message:
```
Error: × Failed to read Buck2 event log from `buck2 build //aaa/aaaa` via /var/folders/z5/fclwwdms3r1gq4k4p3pkvvc00000gn/
│ T/.tmpBgvlUI/buck-log.jsonl.gz
╰─▶ failed to open file `/var/folders/z5/fclwwdms3r1gq4k4p3pkvvc00000gn/T/.tmpBgvlUI/buck-log.jsonl.gz`: No such
file or directory (os error 2)
```
In the future, we may want to write our own line break algorithm that breaks between CJK codepoints and emojis but not at punctuation like slashes. For now, I believe it will be better to break lines at ASCII spaces only.
Similar changes are made for some other settings:
- The default for `break_words` has been changed to `false`.
- The default `textwrap::WordSplitter` has been changed to not split words at existing hyphens, to prevent splits like `--foo-bar` into `--foo-` and `bar`.
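To make the intended behavior concrete, here is a minimal standalone sketch of wrapping that breaks at ASCII spaces only, never breaks inside a word, and never splits at hyphens. It is not textwrap's implementation and not the code in this change: the function name `wrap_at_ascii_spaces` is hypothetical, and display width is approximated by counting `char`s, which is an assumption that does not hold for wide characters.

```rust
/// Hypothetical helper, for illustration only (not part of textwrap).
/// Greedily wraps `text` to `width` columns, breaking only at ASCII spaces.
/// A word longer than `width` is kept whole on its own line instead of
/// being split, so paths and CLI flags stay copy-pastable.
fn wrap_at_ascii_spaces(text: &str, width: usize) -> Vec<String> {
    let mut lines = Vec::new();
    let mut current = String::new();
    let mut current_width = 0;

    // Only the ASCII space is a break opportunity; `/` and `-` are not.
    for word in text.split(' ').filter(|w| !w.is_empty()) {
        // Assumption: one column per char (wrong for wide CJK text or emojis).
        let word_width = word.chars().count();

        // Start a new line if the word does not fit after a separating space.
        if !current.is_empty() && current_width + 1 + word_width > width {
            lines.push(std::mem::take(&mut current));
            current_width = 0;
        }
        if !current.is_empty() {
            current.push(' ');
            current_width += 1;
        }
        current.push_str(word);
        current_width += word_width;
    }

    if !current.is_empty() {
        lines.push(current);
    }
    lines
}
```

With a made-up path, `wrap_at_ascii_spaces("failed to open file /tmp/example/buck-log.jsonl.gz now", 20)` puts the whole path on a line of its own even though it exceeds 20 columns. In textwrap itself, the closest existing settings are presumably `WordSeparator::AsciiSpace` and `WordSplitter::NoHyphenation`, though this description does not name the exact variants chosen.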
I'm hesitant to change the default settings like this, but I think it's important that identifiers, filenames, URLs, and CLI options printed in error messages remain unbroken and copy-pastable by default.