problem-specifications icon indicating copy to clipboard operation
problem-specifications copied to clipboard

Add tests to reverse-string to teach correct string handling

Open tobega opened this issue 4 years ago • 23 comments

Currently the tests for reverse string only contain strings with 7-bit ASCII. This is teaching people the wrong way to handle strings as you can just treat it as an array of bytes. Incorrect handling of non-ASCII characters (or surrogate pairs if you use something like Java) is one of the biggest sources of bugs in the industry.

I propose adding at least a test with a multibyte UTF-8 character as is common in most European languages, e.g. "skåp" which becomes "påks" (this would make a difference in the Julia solutions, for example)

For even better benefit (for UTF-16 languages), it could be good to have a string containing a surrogate pair, e.g. "\uD834\uDD1E", a G-clef, which should come out as it was.

That is probably enough, and it might be too difficult to also handle combining marks, e.g. "as⃝df̅" should become "f̅ds⃝a", not "̅fd⃝sa" (from rosettacode.org)

tobega avatar May 26 '20 13:05 tobega