julia
julia copied to clipboard
Define textwidth for overlong chars
Previously, this would error. There is no guarantee of how terminals render overlong encodings. Some terminals does not print them at all, and some print "�". Here, we set a textwidth of 1, conservatively.
Is this really the correct fix? On 1.11, the reproducer in #58593 was printed like
julia> s = "\xc0\xa0\xe2\x88\xe2|"
"\xc0\xa0\xe2\x88\xe2|"
which has one overlong encoded character (\xc0\xa0) at the front, taking up much more than 1 block of width. If we're willing to overestimate on the suspicion that the terminal can't print it, can't we do a safe fallback to the number of bytes needed to encode this times 4, assuming we'd continue to print this string like above? Might need an update to the docstring too, to mention this case for unusual Strings.
Note that what you're showing is how the method is displayed with Base.show, not print. Textwidth is about print.
I'm planning on making another PR to introduce Base._show_textwidth that gives the width of a char when shown, as opposed to printed. Then that new function will be used in the String display code. But I consider that a separate issue from this bugfix
That does make sense, but the reproducer from the issue still hits this error from show, no? If textwidth is for print, then that error shouldn't happen in the first place.
Conservatively defining textwidth for overlong characters to be 1 does make sense in the context of print though.
Yes, in that sense, this PR is a bandaid. However, IMO, it's a worthwhile bandaid, especially since it's likely no-one is going to rely on the textwidth of show. In essence, the only effect of the true bugfix will be to display strings in the REPL more consistently. Which is nice, but not important.
I don't think this should necessarily include the text "Close #58593", since it is a good fix, but the cause of #58593 seems to be more nearly that we call textwidth(str) instead of textwidth(repr(str)) ?