freefeed-server

Do not count invisible characters in displaynames

Open indeyets opened this issue 6 years ago • 8 comments

We check that a displayname contains a minimum number of characters, but the check counts invisible characters as meaningful data too. It should not.

I suggest we remove all invisible characters from the displayname before checking its length. That way only visible characters are counted, and we no longer end up in the situation where a user enters 3 invisible characters and gets an effectively empty displayname.

See https://stackoverflow.com/questions/11598786/how-to-replace-non-printable-unicode-characters-javascript for some inspiration.
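The proposal above could be sketched roughly as follows. This is only an illustration, not the project's code: the names `INVISIBLE_CHARS`, `MIN_VISIBLE_LENGTH`, `visibleLength` and `isDisplayNameLongEnough` are hypothetical, the minimum of 3 is an assumed value, and treating the Unicode categories Cc, Cf and Z as "invisible" is an assumption the thread has not settled.

```javascript
// Assumption: "invisible" covers control (Cc), format (Cf, which includes
// zero-width spaces and joiners) and separator (Z: spaces, line and
// paragraph separators) characters. The issue does not define the set.
const INVISIBLE_CHARS = /[\p{Cc}\p{Cf}\p{Z}]/gu;

// Hypothetical minimum; the real limit is not stated in the issue.
const MIN_VISIBLE_LENGTH = 3;

// Count only the characters that survive the invisible-character filter.
function visibleLength(displayName) {
  const visibleOnly = displayName.replace(INVISIBLE_CHARS, '');
  // Array.from iterates by code point, so astral characters count once.
  return Array.from(visibleOnly).length;
}

// The original string is left untouched; only the length check changes.
function isDisplayNameLongEnough(displayName) {
  return visibleLength(displayName) >= MIN_VISIBLE_LENGTH;
}
```

With this sketch, a name made of three zero-width spaces has a visible length of 0 and fails the check, while the stored displayname itself is not modified.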

indeyets, Apr 25 '19 10:04

Does it mean filtering them completely or accepting but discounting?

abbra, Oct 01 '19 13:10

@abbra "accepting but discounting"

indeyets, Oct 01 '19 13:10

Thanks. So that means we can use a shorter table, http://jkorpela.fi/chars/spaces.html, to filter those characters out before passing the string to countBreaks().

abbra, Oct 01 '19 14:10

No, this is a table of spaces, not of invisible characters.

davidmz, Oct 01 '19 15:10

What you call 'invisible characters' are actually called 'whitespace characters' in Unicode. For example, https://en.wikipedia.org/wiki/Whitespace_character#Unicode lists 25 of those, plus 6 characters that lack the White_Space property in the current standard but effectively represent a (potentially zero-width) white space, i.e. they are invisible.

Do you have anything on top of those 31?

abbra, Oct 01 '19 18:10

I made a mistake: I meant non-printable characters, not invisible ones. Unfortunately, the issue description does not specify which characters are meant or what kind of username counts as wrong. Whitespace characters in screennames are perfectly acceptable (except for leading and trailing spaces), although we probably should not count zero-width spaces.

Besides zero-width spaces, there are other non-printable characters as well; see https://en.wikipedia.org/wiki/C0_and_C1_control_codes and https://en.wikipedia.org/wiki/Unicode_control_characters
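To make the distinction in this comment concrete, here is a rough sketch that sorts each code point into one of four buckets and counts only printable characters and ordinary whitespace. The function names `classify` and `countMeaningful` and the bucket labels are hypothetical, and mapping "non-printable" to the Unicode categories Cc and Cf is an assumption, not something the thread has agreed on.

```javascript
// Classify a single code point. Control is checked first, so characters
// such as TAB, which are both Cc and White_Space, are reported as 'control'.
function classify(ch) {
  if (/^\p{Cc}$/u.test(ch)) return 'control';       // C0 and C1 control codes
  if (/^\p{Cf}$/u.test(ch)) return 'format';        // e.g. U+200B, U+200D, U+FEFF
  if (/^\p{White_Space}$/u.test(ch)) return 'whitespace';
  return 'printable';
}

// Count characters after dropping leading and trailing whitespace.
// Inner whitespace counts; control and format characters do not.
function countMeaningful(name) {
  let count = 0;
  for (const ch of name.trim()) {
    const kind = classify(ch);
    if (kind === 'printable' || kind === 'whitespace') count += 1;
  }
  return count;
}
```

Under these assumptions `countMeaningful(' a\u200B b ')` is 3: the outer spaces are trimmed, the zero-width space is skipped, and the inner space is counted.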

davidmz, Oct 01 '19 18:10

Ok, thanks for the confirmation. I'll look into it.

abbra, Oct 01 '19 19:10

I'd like to point out that invisible symbols aren't the only complication. Strings such as "ä̍̎̏" or, for example, "ёж" are also valid screennames.
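One reading of this comment is that combining marks make the number of code points differ from the number of characters a reader actually sees. The sketch below compares the two counts using the standard `Intl.Segmenter` API (available in recent Node.js releases). The helper names `codePointCount` and `graphemeCount` are hypothetical, and the sample strings are written as escape sequences chosen for illustration, so they may not match the exact encoding used in the comment.

```javascript
// Segment by user-perceived characters (extended grapheme clusters).
const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });

// Number of Unicode code points in the string.
function codePointCount(text) {
  return Array.from(text).length;
}

// Number of grapheme clusters: a base letter plus its combining marks
// is counted as one character.
function graphemeCount(text) {
  return Array.from(segmenter.segment(text)).length;
}

// 'a' followed by four combining marks: 5 code points, 1 grapheme.
const stacked = 'a\u0308\u030D\u030E\u030F';
// Cyrillic 'е' + combining diaeresis, then 'ж': 3 code points, 2 graphemes.
const decomposed = '\u0435\u0308\u0436';

console.log(codePointCount(stacked), graphemeCount(stacked));
console.log(codePointCount(decomposed), graphemeCount(decomposed));
```

Whether a minimum-length check should use code points or grapheme clusters is a design choice the thread leaves open; the sketch only shows that the two counts can disagree.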

davidmz, Oct 15 '19 18:10