Phoenix
RichTextCtrl incorrectly underlining and colouring text when re-writing it
Operating system: Windows 10
wxPython version & source: 4.0.4, installed by PyCharm
Python version & source: 3.7
Description of the problem: I am developing a control derived from RichTextCtrl which uses a spell-checker to identify spelling mistakes and highlights them by underlining them and showing them in red. The control also has to deal with URLs. When text is entered, the control spell-checks each word and uses the BeginUnderline/EndUnderline and BeginTextColour/EndTextColour methods to mark the wrongly spelled ones. It is therefore obliged to rewrite the control's text in its entirety, as this seems to be the only way to do it: it can't just 'mark' the existing text at specified positions.
The program initially uses the control to display "The wxRichTextCtrl ... generate an event." which it does properly. If the user then inserts text ("great" for example) into the control, the first letter (g) is correctly marked but when the "r" is entered the entire text apart from the URL is marked as wrongly spelled.
<details>
<summary>Code Example (click to expand)</summary>

```python
import wx
import re
import wx.richtext as rt
from symspellpy import SymSpell, Verbosity
from symspellpy.suggest_item import SuggestItem
import pkg_resources


class _Match:
    def __init__(self, match : re.Match, isURL, suggestions=None):
        self.match = match
        self.isURL = isURL
        self.suggestions = suggestions


class RichTextControl(rt.RichTextCtrl):
    # This is adapted from https://gist.github.com/gruber/8891611
    _REGEXP = r"(?P<URL>(?i)\b((?:https?:(?:/{1,3}|[a-z0-9%])|[a-z0-9.\-]+[.](?:com|net|org|edu|gov|mil|aero|asia|biz|cat|" \
        r"coop|info|int|jobs|mobi|museum|name|post|pro|tel|travel|xxx|ac|ad|ae|af|ag|ai|al|am|an|ao|aq|ar|as|" \
        r"at|au|aw|ax|az|ba|bb|bd|be|bf|bg|bh|bi|bj|bm|bn|bo|br|bs|bt|bv|bw|by|bz|ca|cc|cd|cf|cg|ch|ci|ck|cl|" \
        r"cm|cn|co|cr|cs|cu|cv|cx|cy|cz|dd|de|dj|dk|dm|do|dz|ec|ee|eg|eh|er|es|et|eu|fi|fj|fk|fm|fo|fr|ga|gb|" \
        r"gd|ge|gf|gg|gh|gi|gl|gm|gn|gp|gq|gr|gs|gt|gu|gw|gy|hk|hm|hn|hr|ht|hu|id|ie|il|im|in|io|iq|ir|is|it|" \
        r"je|jm|jo|jp|ke|kg|kh|ki|km|kn|kp|kr|kw|ky|kz|la|lb|lc|li|lk|lr|ls|lt|lu|lv|ly|ma|mc|md|me|mg|mh|mk|" \
        r"ml|mm|mn|mo|mp|mq|mr|ms|mt|mu|mv|mw|mx|my|mz|na|nc|ne|nf|ng|ni|nl|no|np|nr|nu|nz|om|pa|pe|pf|pg|ph|" \
        r"pk|pl|pm|pn|pr|ps|pt|pw|py|qa|re|ro|rs|ru|rw|sa|sb|sc|sd|se|sg|sh|si|sj|Ja|sk|sl|sm|sn|so|sr|ss|st|" \
        r"su|sv|sx|sy|sz|tc|td|tf|tg|th|tj|tk|tl|tm|tn|to|tp|tr|tt|tv|tw|tz|ua|ug|uk|us|uy|uz|va|vc|ve|vg|vi|" \
        r"vn|vu|wf|ws|ye|yt|yu|za|zm|zw)/)(?:[^\s()<>{}\[\]]+|\([^\s()]*?\([^\s()]+\)[^\s()]*?\)|\([^\s]+?\))" \
        r"+(?:\([^\s()]*?\([^\s()]+\)[^\s()]*?\)|\([^\s]+?\)|[^\s`!()\[\]{};:'" \
        r'".,<>?«»“”‘’])|(?:(?<!@)[a-z0-9]+(?:[.\-][a-z0-9]+)*[.](?:com|net|org|edu|gov|mil|aero|asia|biz|cat|' \
        r'coop|info|int|jobs|mobi|museum|name|post|pro|tel|travel|xxx|ac|ad|ae|af|ag|ai|al|am|an|ao|aq|ar|as|at' \
        r'|au|aw|ax|az|ba|bb|bd|be|bf|bg|bh|bi|bj|bm|bn|bo|br|bs|bt|bv|bw|by|bz|ca|cc|cd|cf|cg|ch|ci|ck|cl|cm|cn' \
        r'|co|cr|cs|cu|cv|cx|cy|cz|dd|de|dj|dk|dm|do|dz|ec|ee|eg|eh|er|es|et|eu|fi|fj|fk|fm|fo|fr|ga|gb|gd|ge|gf' \
        r'|gg|gh|gi|gl|gm|gn|gp|gq|gr|gs|gt|gu|gw|gy|hk|hm|hn|hr|ht|hu|id|ie|il|im|in|io|iq|ir|is|it|je|jm|jo|jp' \
        r'|ke|kg|kh|ki|km|kn|kp|kr|kw|ky|kz|la|lb|lc|li|lk|lr|ls|lt|lu|lv|ly|ma|mc|md|me|mg|mh|mk|ml|mm|mn|mo|mp' \
        r'|mq|mr|ms|mt|mu|mv|mw|mx|my|mz|na|nc|ne|nf|ng|ni|nl|no|np|nr|nu|nz|om|pa|pe|pf|pg|ph|pk|pl|pm|pn|pr|ps' \
        r'|pt|pw|py|qa|re|ro|rs|ru|rw|sa|sb|sc|sd|se|sg|sh|si|sj|Ja|sk|sl|sm|sn|so|sr|ss|st|su|sv|sx|sy|sz|tc|td' \
        r'|tf|tg|th|tj|tk|tl|tm|tn|to|tp|tr|tt|tv|tw|tz|ua|ug|uk|us|uy|uz|va|vc|ve|vg|vi|vn|vu|wf|ws|ye|yt|yu|za|zm|zw)\b/?(?!@))))' \
        r'|(?P<WORD>\w+)'

    def __init__(self, *args, **kwargs):
        super(RichTextControl, self).__init__(*args, **kwargs)
        self.Bind(wx.EVT_TEXT, self.onText)
        self.regExp = re.compile(self._REGEXP)
        self.speller = SymSpell()
        path = pkg_resources.resource_filename("symspellpy", "frequency_dictionary_en_82_765.txt")
        print('Loading dictionary ...')
        self.speller.load_dictionary(path, 0, 1)
        self.urlStyle = rt.RichTextAttr()
        self.urlStyle.SetTextColour(wx.Colour(33, 94, 161))

    def onText(self, evt : wx.CommandEvent):
        self.ProcessText()

    def ProcessText(self):
        text = self.GetValue()
        m : re.Match
        matches = list(self.regExp.finditer(text))
        workList = list()
        for m in matches:
            word = m.group('WORD')
            if word is not None:
                result = self.speller.lookup(word, Verbosity.TOP)
                if result != word:
                    if len(result) == 1 and result[0].term.lower() == m.group().lower():
                        pass
                    elif len(result) > 0: # mistake with suggestions
                        workList.append(_Match(m, False, result))
                    else: # mistake, but no suggestions
                        workList.append(_Match(m, False, None))
            else:
                workList.append(_Match(m, True))
        if len(workList) > 0:
            self.ReplaceText(text, workList)

    def ReplaceText(self, originalText, workList):
        self.Freeze()
        csrPos = self.GetCaretPosition()
        self.Clear()
        pos = 0
        item : _Match
        for item in workList:
            self.WriteText(originalText[pos:item.match.start()]) # append text preceding each group
            if item.isURL:
                self.BeginStyle(self.urlStyle)
                self.BeginURL(item.match.group())
                self.WriteText(item.match.group())
                self.EndStyle()
                self.EndURL()
            else:
                self.BeginUnderline()
                self.BeginTextColour(wx.Colour('red'))
                self.WriteText(item.match.group())
                self.EndTextColour()
                self.EndUnderline()
            pos = item.match.end()
        # Append trailing text
        self.WriteText(originalText[item.match.end():])
        self.SetCaretPosition(csrPos)
        self.Thaw()
        pass


class RichTextFrame(wx.Frame):
    def __init__(self, *args, **kw):
        wx.Frame.__init__(self, *args, **kw)
        self.rtc = RichTextControl(self, style=wx.VSCROLL | wx.HSCROLL | wx.NO_BORDER)
        self.rtc.Bind(wx.EVT_TEXT_URL, self.OnURL)
        urlStyle = rt.RichTextAttr()
        urlStyle.SetTextColour(wx.Colour(33, 94, 161))
        self.rtc.WriteText("The wxRichTextCtrl can also display URLs, such as this one: ")
        self.rtc.BeginStyle(urlStyle)
        self.rtc.BeginURL("http://www.wxwidgets.org")
        self.rtc.WriteText("http://www.wxwidgets.org")
        self.rtc.EndURL()
        self.rtc.EndStyle()
        self.rtc.WriteText(". Click on the URL to generate an event.")

    def OnURL(self, evt):
        wx.MessageBox(evt.GetString(), "URL Clicked")


class TestPanel(wx.Panel):
    def __init__(self, parent):
        wx.Panel.__init__(self, parent, -1)
        win = RichTextFrame(self, -1, "Rich-text frame", size=(700, 500), style = wx.DEFAULT_FRAME_STYLE)
        win.Show(True)


if __name__ == '__main__':
    app = wx.App(0)
    frame = wx.Frame(None)
    panel = TestPanel(frame)
    app.MainLoop()
```
</details>

Can you please make a simpler reproducer (i.e., an SSCCE)?
Also, can you try 4.2.0? 4.0.4 is quite old.
Thanks for replying. I have tried to install the latest version as follows: I ran pip install wxPython==4.2.0 within PyCharm's Terminal window when in the project. This worked without errors. I then told PyCharm to remove wxPython 4.0.4 and then added 4.2, which was now visible as the latest version. Unfortunately this failed, being unable to find a module called attrDict (see screenshot). I can't find this file anywhere on my machine.

I'll try to think of a way to provide a simpler reproducer ....
Try pip install attrdict3 before installing wxPython 4.2.0.
I now get this error installing wxPython after installing attrdict3

I will remove code to do with URLs. Will that be minimal enough?
I started a new PyCharm project and wxPython 4.2 installed successfully in it. The problem is still there though.
Here is a simpler version which shows the problem. Start the program and then type "XX" (the only 'spelling mistake' the program recognises) anywhere in the string that appears. The XX is correctly shown in red and underlined. Type one more character (anything) and the entire line is wrongly shown in red and underlined.
```python
import wx
import re
import wx.richtext as rt


class RichTextControl(rt.RichTextCtrl):
    _REGEXP = r'(?P<WORD>\w+)'

    def __init__(self, *args, **kwargs):
        super(RichTextControl, self).__init__(*args, **kwargs)
        self.Bind(wx.EVT_TEXT, self.onText)
        self.regExp = re.compile(self._REGEXP)
        self.urlStyle = rt.RichTextAttr()
        self.urlStyle.SetTextColour(wx.Colour(33, 94, 161))

    def onText(self, evt : wx.CommandEvent):
        self.ProcessText()

    def ProcessText(self):
        text : str = self.GetValue()
        m : re.Match
        matches = list(self.regExp.finditer(text))
        workList = list()
        for m in matches:
            word = m.group('WORD')
            if word is not None:
                if m.group() == 'XX':
                    workList.append(m)
        if len(workList) > 0:
            self.ReplaceText(text, workList)

    def ReplaceText(self, originalText, workList):
        self.Freeze()
        csrPos = self.GetCaretPosition()
        self.Clear()
        pos = 0
        item : re.Match
        for item in workList:
            self.WriteText(originalText[pos:item.start()]) # append text preceding each group
            self.BeginUnderline()
            self.BeginTextColour(wx.Colour('red'))
            self.WriteText(item.group())
            self.EndTextColour()
            self.EndUnderline()
            pos = item.end()
        # Append trailing text
        self.WriteText(originalText[item.end():])
        self.SetCaretPosition(csrPos)
        self.Thaw()


class RichTextFrame(wx.Frame):
    def __init__(self, *args, **kw):
        wx.Frame.__init__(self, *args, **kw)
        self.rtc = RichTextControl(self, style=wx.VSCROLL | wx.HSCROLL | wx.NO_BORDER)
        self.rtc.Bind(wx.EVT_TEXT_URL, self.OnURL)
        self.rtc.WriteText("The wxRichTextCtrl can also display URLs.")

    def OnURL(self, evt):
        wx.MessageBox(evt.GetString(), "URL Clicked")


class TestPanel(wx.Panel):
    def __init__(self, parent):
        wx.Panel.__init__(self, parent, -1)
        win = RichTextFrame(self, -1, "Rich-text frame", size=(700, 500), style = wx.DEFAULT_FRAME_STYLE)
        win.Show(True)


if __name__ == '__main__':
    app = wx.App(0)
    frame = wx.Frame(None)
    panel = TestPanel(frame)
    app.MainLoop()
```
Instead of using BeginUnderline(), BeginTextColour() and re-entering the text etc, why not create a RichTextAttr to define the style and then use SetStyle() to apply the style to the appropriate range of characters in the control?
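A minimal sketch of that suggestion, assuming the misspelled words can be located with a regular expression over the text returned by `GetValue()`. The helper names (`misspelled_ranges`, `make_error_style`, `mark_ranges`) and the `is_misspelled` callback are hypothetical; only `RichTextAttr`, `SetTextColour`, `SetFontUnderlined` and `SetStyle` are wxPython calls. It also assumes that `SetStyle(start, end, style)` treats `end` as exclusive, like a Python match span, which is worth verifying on your version.

```python
import re

WORD_RE = re.compile(r"\w+")


def misspelled_ranges(text, is_misspelled):
    """Return (start, end) spans of the words that is_misspelled() flags."""
    return [(m.start(), m.end())
            for m in WORD_RE.finditer(text)
            if is_misspelled(m.group())]


def make_error_style():
    """Build a red, underlined style (wx is imported lazily on purpose)."""
    import wx
    import wx.richtext as rt
    attr = rt.RichTextAttr()
    attr.SetTextColour(wx.RED)
    attr.SetFontUnderlined(True)
    return attr


def mark_ranges(ctrl, ranges, style):
    """Apply style to each span in place, without rewriting any text."""
    for start, end in ranges:
        ctrl.SetStyle(start, end, style)


# Hypothetical usage inside the control's EVT_TEXT handler:
#   ranges = misspelled_ranges(self.GetValue(), lambda w: w == "XX")
#   mark_ranges(self, ranges, make_error_style())
```

Because the text itself is never cleared and rewritten, the caret position does not need to be saved and restored.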
Fair point. But the control will eventually have to rewrite the text because that is (AFAIK) the only way to have it correct spelling mistakes it finds - because it can't replace a range of characters, I believe.
The Replace() method will do that.
Replace(self, from_, to_, value)
Replaces the content in the specified range with the string specified by value.
Parameters
from_ (long) –
to_ (long) –
value (string) –
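As a rough illustration of how `Replace()` could correct mistakes without rewriting the whole control, the sketch below applies a list of corrections from right to left, so that a replacement of a different length does not shift the offsets of corrections still to be applied. The `apply_corrections` helper and the `(start, end, replacement)` tuple format are assumptions made for this example, not part of wxPython.

```python
def apply_corrections(ctrl, corrections):
    """Replace each (start, end, replacement) span in ctrl.

    Spans are processed from the highest start offset to the lowest, so
    earlier offsets stay valid even when a replacement changes the length.
    """
    for start, end, replacement in sorted(corrections, reverse=True):
        ctrl.Replace(start, end, replacement)


# Hypothetical usage with a RichTextCtrl containing "alot of alot":
#   apply_corrections(rtc, [(0, 4, "a lot"), (8, 12, "a lot")])
```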
Yes, I have just noticed it!! I will try your suggestion - thanks. However, I do think there is a bug: my original method should work too.
I tried to run your example code. I think I have fixed the formatting issues caused by pasting to this list.
However, I'm getting an error where it tries to compile the regex:
re.error: unknown extension ?P\w at position 1
Is the statement _REGEXP = r'(?P\w+)' correct?
No, I'm sorry. The code-quoting (ctrl+e) is screwing up things and I hadn't noticed. I've attached the code, which actually works! So all I have to do now is figure out why my original version (rejected as too complicated) does not! I'm attaching the correct code as I can't get it inline correctly. richTextURLdetection.zip
Thanks for uploading the code.
I think this is a feature of the RichTextCtrl. You can see it doing the same thing in the example in the wxPython demo. If you put the cursor at the end of any of the sections that has a style applied, the same style is applied to any new text you enter from there. This includes any spaces (on which the style is invisible).
What I do in my app is to clear all the styles whenever the text is changed and regenerate them based on the content.
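One possible shape for that approach is sketched below. The `Restyler` class, its parameters and the re-entrancy guard are assumptions for illustration, not the code from the app mentioned above. `normal_style` is assumed to be a `RichTextAttr` that switches underlining off and restores the default text colour, and `find_ranges` is any function that maps the text to a list of `(start, end)` spans to highlight.

```python
class Restyler:
    """Reset all styling, then re-apply it from the current text."""

    def __init__(self, ctrl, find_ranges, normal_style, error_style):
        self.ctrl = ctrl
        self.find_ranges = find_ranges
        self.normal_style = normal_style
        self.error_style = error_style
        self._busy = False  # guards against re-entry if styling fires events

    def on_text(self, event=None):
        if self._busy:
            return
        self._busy = True
        try:
            text = self.ctrl.GetValue()
            # Step 1: wipe any style inherited from neighbouring text.
            self.ctrl.SetStyle(0, len(text), self.normal_style)
            # Step 2: regenerate the highlights from the content alone.
            for start, end in self.find_ranges(text):
                self.ctrl.SetStyle(start, end, self.error_style)
        finally:
            self._busy = False


# Hypothetical wiring: rtc.Bind(wx.EVT_TEXT, Restyler(rtc, ...).on_text)
```

Since every pass starts from a clean slate, a style picked up by text typed next to a highlighted word is removed on the next change.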
Thanks for all your help. I'm working on using styles now ...
Happy to help. If you wish to discuss anything else about the RTC, or other wxPython topics I would recommend posting a question at https://discuss.wxpython.org/