Constrain the ement-room--event-mentions-user-p regexp
At present the word uphill in a message is treated as mentioning the username phil :)
Let's at minimum wrap the pattern with word boundaries?
(Or do we then worry about usernames which do not begin/end with word characters? Either way, I'm sure we can improve this...)
Makes sense. Thanks.
I looked to see if there were restrictions on characters used for a user's display name, and I'm not seeing anything. Searching for displayname at https://spec.matrix.org/v1.8/client-server-api/ gives lots of info, but it seems like the value may be "any string".
So we can't simply wrap the quoted display name with word boundaries, but I suspect that even if the user had non-word characters at the start or end of their display name, we could assume that any mention of their name will have non-word characters (or bol/eol) surrounding the display name. So maybe something like this:
"\\(?:^\\|[^[:word:]]\\)\\(DISPLAYNAME\\)\\(?:$\\|[^[:word:]]\\)"
Possibly in the form of a rx-based template in a new variable or user option.
It's kind of hidden here: https://spec.matrix.org/v1.8/appendices/#user-identifiers
Ah, yes, we have different rules for username and displayname (I was only thinking about displayname).
At present ement-room--event-mentions-user-p uses the same code for both. In fact it matches on three names...
(or (matches-body-p (ement-user-username user))
(matches-body-p (ement--user-displayname-in room user))
(matches-body-p (ement-user-id user)))))))
Where username is the "Username part of user's Matrix ID." in the user struct; so I guess we apply the character constraints from your spec URL to both ement-user-username and ement-user-id, and only ement--user-displayname-in should be the more permissive one.
IIUC, doing this would fix the issue for both username and useid, right?
(or (matches-body-p (format "\\b%s\\b" (ement-user-username user)))
(matches-body-p (ement--user-displayname-in room user))
(matches-body-p (format "\\b%s\\b" (ement-user-id user))))))))
Can we do this first and deal with the user displayname after?
About the displayname, it sounds like there is no silver bullet.
For example, I think the displayname ":-) me :-)" can be considered a mention in "Hello :-) me :-)" but also in "Hello:-) me :-)".
But the displayname ". Therefore" would be matched against "Bla bla. Therefore bla bla", which seems counterintuitive.
I guess that to deal with all cases, we should provide a custom fonction that the user may define to match per own displayname, and provide one by default. My opinion would be to provide one that keep the current behavior.