wx.lib.wordwrap.wordwrap() raises error on multi-byte Unicode
Operating system: Windows
wxPython version & source: 4.1.1 msw (phoenix) wxWidgets 3.1.5 (pip-installed)
Python version & source: stock 3.8.10
Description of the problem:
When given a string containing a multi-byte Unicode character such as 😐 ("neutral face", chr(0x1F610)), wx.lib.wordwrap.wordwrap() raises an IndexError.
Example:
import wx, wx.lib.wordwrap
app = wx.App()
wx.lib.wordwrap.wordwrap(chr(0x1F610), 100, wx.MemoryDC())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Program Files\Python3\lib\site-packages\wx\lib\wordwrap.py", line 36, in wordwrap
    if line[idx] == ' ':
IndexError: string index out of range
The problem is that dc.GetPartialTextExtents() returns multiple lengths for a single character, so the list of extents ends up longer than the string that wordwrap() indexes alongside it:
>>> wx.MemoryDC().GetPartialTextExtents(chr(0x1F610))
[5, 11]
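A minimal sketch of why the index runs past the end of the string. It assumes the two extents correspond to the two UTF-16 code units of the character (an assumption about the Windows build, not something confirmed above). The helper `utf16_units` is a hypothetical name, and wx is not needed to run it:

```python
def utf16_units(text):
    """Return how many UTF-16 code units are needed to encode text."""
    return len(text.encode("utf-16-le")) // 2

line = chr(0x1F610)          # one Python 3 character
pte = [5, 11]                # extents reported above for that character

print(len(line))             # 1
print(utf16_units(line))     # 2
print(len(pte))              # 2

# Indexing the string with every index of the extents list overruns it,
# which mirrors the failing `line[idx]` lookup in the traceback.
try:
    for idx in range(len(pte)):
        line[idx]
    overran = False
except IndexError:
    overran = True
print(overran)               # True
```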
One workaround would be to measure each character with GetTextExtent() instead.
In https://github.com/wxWidgets/Phoenix/blob/master/wx/lib/wordwrap.py#L27, replace
pte = dc.GetPartialTextExtents(line)
with:
pte = []
for c in line:
    pte.append(dc.GetTextExtent(c).width + (pte[-1] if pte else 0))
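The replacement can be sketched as a standalone helper and checked without wx. Here `cumulative_extents` and the stand-in `fake_measure` callback are hypothetical names for illustration, and the widths are made up; inside wordwrap() the callback would presumably be something like `lambda c: dc.GetTextExtent(c).width`.

```python
from itertools import accumulate

def cumulative_extents(line, measure):
    """Return one running-total width per Python character in line.

    measure is any callable mapping a single character to its width.
    """
    return list(accumulate(measure(c) for c in line))

# Stand-in measurement with made-up widths, for illustration only.
def fake_measure(char):
    return 11 if ord(char) > 0xFFFF else 6

line = "a " + chr(0x1F610)
pte = cumulative_extents(line, fake_measure)
print(pte)                    # [6, 12, 23]
print(len(pte) == len(line))  # True
```

Because the list always has exactly one entry per character, indexing `line` and `pte` with the same index stays in range. Measuring characters one at a time ignores kerning, so the totals may differ slightly from what GetPartialTextExtents() reports for ordinary text.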
I can make a pull request if this change would be acceptable.
(Side note: in Python 2, GetPartialTextExtents() similarly returned multiple lengths, but because Python 2 strings counted multi-byte characters as multiple characters, the problem did not arise.)