hedy
[Language idea] Ignore accents when transpiling
When checking the input of an if, we match exactly the string given by the user, which is nice! But sometimes we may want to be more forgiving about what constitutes a match. One such case is accents: an accented character looks very similar to its ASCII counterpart, requires pressing an extra key first, and (in Spanish, at least) is often left out in day-to-day writing, so it's easy to omit.
This could cause problems when checking an if, for example:
if ans is sí
print 'Awesome'
If the kid inputs si instead of sí, the if branch will not be entered, which may confuse the kid. The same logic applies to keywords and variables: we'd want jesus and jesús to be the same variable.
@Felienne has pointed out that we might not want to do this for every language or for every type of accent, because accented and unaccented characters are not necessarily equivalent in some languages, French for example.
One possible way to deal with this, suggested by @Felienne and @TiBiBa, is to create a mapper that maps characters with accents to their ASCII equivalents. One downside is that this is very slow.
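To make the mapper idea concrete, here is a minimal sketch, assuming one translation table per language so that a language like French can opt out. The names `ACCENT_TABLES` and `fold_accents` and the language codes are hypothetical and not part of Hedy. With a precomputed table, `str.translate` does one lookup per character, though I haven't measured how that compares with other approaches.

```python
# Hypothetical per-language tables: which accented characters may be folded.
ACCENT_TABLES = {
    # Spanish: fold acute accents and the diaeresis, but keep ñ (a separate letter).
    "es": str.maketrans("áéíóúüÁÉÍÓÚÜ", "aeiouuAEIOUU"),
    # French: fold nothing, since accents can distinguish words there.
    "fr": str.maketrans("", ""),
}


def fold_accents(text, lang):
    """Replace accented characters using the table for `lang`, if one exists."""
    table = ACCENT_TABLES.get(lang)
    return text.translate(table) if table else text


print(fold_accents("sí", "es"))     # si
print(fold_accents("jesús", "es"))  # jesus
print(fold_accents("año", "es"))    # año
print(fold_accents("pâté", "fr"))   # pâté
```

Languages without an entry are left untouched, so adding accent folding for a new language would be an explicit decision.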
Apparently there is already a library for this! (Of course there is in Python...):
import unidecode
somestring = "àéêöhello"
# Python 3 strings are already Unicode, so no decoding step is needed
# transliterate the accented characters to their closest ASCII equivalents
print(unidecode.unidecode(somestring))
Output:
aeeohello
Found the example here: https://stackoverflow.com/questions/44431730/how-to-replace-accented-characters#44433664
Oh wow, that is a great find, @TiBiBa!
We do, however, still have the issue of comparisons on the front-end, so we should implement a similar solution in TypeScript. Because we don't talk to the server after the code is transpiled to Python (correct me if I'm wrong!), both the front-end and the back-end need to replace the characters for code like the following:
animal = 'panda'
if animal is pandá print 'awesome!'
else print 'sad face'
Yes! I found this earlier, and they mention some problems with it, but I haven't tested it myself (https://stackoverflow.com/questions/517923/what-is-the-best-way-to-remove-accents-normalize-in-a-python-unicode-string)
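For comparison, here is a rough sketch of a standard-library alternative that needs no extra dependency: decompose each character with `unicodedata.normalize` and drop the combining marks. The function name `strip_accents` is made up for this example, and this is not necessarily the approach the linked thread recommends. It also shows the kind of edge cases we would have to decide on.

```python
import unicodedata


def strip_accents(text):
    """Decompose characters (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("àéêöhello"))  # aeeohello
print(strip_accents("año"))        # ano: ñ loses its tilde too
print(strip_accents("søster"))     # søster: ø has no decomposition
```

Note that this folds every accent in every language, so by itself it does not address the concern about languages where accents matter.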
Maybe we can do the same thing as with the numeric characters and include a function within the transpiled code, something like:
input = normalize_accents(input)
to_check = normalize_accents(rhs_if)
if input == to_check:
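A sketch of how that could look end to end, assuming the transpiler prepends a helper to the generated program and wraps both sides of the comparison. `PRELUDE`, `transpile_if_equals` and the tiny Spanish-only table are all made up for illustration; this is not how Hedy's transpiler is actually structured, and `exec` is only used here to demonstrate the generated code.

```python
# Hypothetical helper source that would be prepended to every transpiled program.
PRELUDE = '''
_ACCENTS = str.maketrans("áéíóú", "aeiou")

def normalize_accents(text):
    return str(text).translate(_ACCENTS)
'''


def transpile_if_equals(variable, literal, body):
    """Emit Python for a Hedy-style `if <variable> is <literal>` check (hypothetical)."""
    return (
        f"if normalize_accents({variable}) == normalize_accents({literal!r}):\n"
        f"    {body}\n"
    )


# Simulate the kid typing 'si' while the program checks for 'sí'.
program = PRELUDE + "ans = 'si'\n" + transpile_if_equals("ans", "sí", "result = 'Awesome'")
namespace = {}
exec(program, namespace)
print(namespace["result"])  # Awesome
```

Because the helper lives inside the generated program, the comparison would not need a round trip to the server, but the front-end would still need an equivalent for anything it evaluates itself.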
And this one @boryanagoncharenko? Could be some fun language puzzling?