FAQ: Better handling of Unicode value U+FE0F with Python+JavaScript.

Open kolibril13 opened this issue 3 years ago • 7 comments

Don't merge yet, it's only a draft pull request. Attempt to solve #404. @Joshix-1 can you have a look at this? It's not yet a working solution, because sometimes this character is needed:

🦴 -> "1F9B4", can be found in https://raw.githubusercontent.com/hfg-gmuend/openmoji/master/color/72x72/1F9B4.png
🐿️ -> "1F43F-FE0F", can be found in https://raw.githubusercontent.com/hfg-gmuend/openmoji/master/color/72x72/1F43F.png
👩‍⚕️ -> "1F469-200D-2695-FE0F", can be found in https://raw.githubusercontent.com/hfg-gmuend/openmoji/master/color/72x72/1F469-200D-2695-FE0F.png

Currently, the last example would break, because the FE0F should not be removed. I think a case distinction is needed here. Am I right in assuming that all emojis made up of more than two of these character codes (separated by a "-") should keep their final FE0F?

kolibril13 avatar Jul 17 '22 13:07 kolibril13

🏳️ would break too as it is saved as 1F3F3-FE0F https://raw.githubusercontent.com/hfg-gmuend/openmoji/master/color/72x72/1F3F3-FE0F.png

If it doesn't really matter if it is a real emoji, wouldn't it be better to do it like GitHub and always remove the -FE0F?
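To make the trade-off of always dropping the selector concrete, here is a minimal sketch. The helper name `code_without_fe0f` is hypothetical, and the emojis are written as escape sequences so the code points are unambiguous. It only computes a name; whether a matching file exists is a separate question.

```python
VARIATION_SELECTOR_16 = "\uFE0F"


def code_without_fe0f(emoji: str) -> str:
    """Hypothetical helper: hex code points joined by '-', skipping every U+FE0F."""
    return "-".join(f"{ord(c):X}" for c in emoji if c != VARIATION_SELECTOR_16)


chipmunk = "\U0001F43F\uFE0F"                   # 🐿️
health_worker = "\U0001F469\u200D\u2695\uFE0F"  # 👩‍⚕️

print(code_without_fe0f(chipmunk))       # 1F43F
print(code_without_fe0f(health_worker))  # 1F469-200D-2695
```

The second result no longer matches the 1F469-200D-2695-FE0F.png file name quoted earlier in this thread, which is the case that speaks against stripping unconditionally.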

If it only gets removed for the short sequences, the following code should work for all emojis except the white flag:

Python:
emoji_code = "-".join(f"{ord(c):x}" for c in emoji).upper()
if len(emoji) == 2:
    emoji_code = emoji_code.removesuffix("-FE0F")

JavaScript:
const codePoints = [...emoji];
let emojiCode = codePoints.map(e => e.codePointAt(0).toString(16)).join(`-`).toUpperCase();
// emoji.length counts UTF-16 code units, so count code points instead
if (codePoints.length === 2) emojiCode = emojiCode.replace(/-FE0F$/, "");
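A side note on the length check: Python's `len()` counts code points, while JavaScript's `String.length` counts UTF-16 code units, so the same emoji can report different lengths in the two languages. A small Python sketch (the helper names are made up for illustration) that computes both counts:

```python
def code_point_count(text: str) -> int:
    """Number of Unicode code points, which is what len() returns in Python."""
    return len(text)


def utf16_unit_count(text: str) -> int:
    """Number of UTF-16 code units, which is what String.length reports in JavaScript."""
    # Every UTF-16 code unit is two bytes; "utf-16-le" avoids a leading BOM.
    return len(text.encode("utf-16-le")) // 2


chipmunk = "\U0001F43F\uFE0F"  # 🐿️: one astral code point plus U+FE0F

print(code_point_count(chipmunk))  # 2
print(utf16_unit_count(chipmunk))  # 3
```

Code points above U+FFFF take two UTF-16 code units, so a check against 2 behaves differently depending on which count is used.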

Am I right in assuming that all emojis made up of more than two of these character codes (separated by a "-") should keep their final FE0F?

I am not sure. I think none should have it removed, but I'm not sure.

Joshix-1 avatar Jul 17 '22 17:07 Joshix-1

Another issue I just noticed with the code is that it doesn't work with, e.g., https://openmoji.org/library/emoji-0035-FE0F-20E3/ (the leading zeros are missing). Fix for Python:

"-".join(f"{ord(c):04x}" for c in emoji).upper()

Fix for JavaScript:

[...emoji].map(e => e.codePointAt(0).toString(16).padStart(4, '0')).join(`-`).toUpperCase()
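As a quick check of what the padding changes, here is a short sketch using the keycap from the link above, written with escape sequences so the code points are explicit. The variable names are only illustrative.

```python
keycap_five = "5\uFE0F\u20E3"  # 5️⃣: DIGIT FIVE, U+FE0F, U+20E3

# Without a minimum width, U+0035 is rendered as just "35".
unpadded = "-".join(f"{ord(c):X}" for c in keycap_five)
# "04X" pads to at least four uppercase hex digits; longer values are left untouched.
padded = "-".join(f"{ord(c):04X}" for c in keycap_five)

print(unpadded)  # 35-FE0F-20E3
print(padded)    # 0035-FE0F-20E3

bone = "\U0001F9B4"  # 🦴
print(f"{ord(bone):04X}")  # 1F9B4
```

The format width is a minimum, so five-digit code points such as 1F9B4 are not truncated.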

Joshix-1 avatar Jul 28 '22 22:07 Joshix-1

🏝 OpenMoji is on hold over summer (project maintainers are out of office until Oct 2022).

github-actions[bot] avatar Jul 28 '22 22:07 github-actions[bot]

🏳️ would break too as it is saved as 1F3F3-FE0F https://raw.githubusercontent.com/hfg-gmuend/openmoji/master/color/72x72/1F3F3-FE0F.png

For me, it does not break, I could also find this one: https://raw.githubusercontent.com/hfg-gmuend/openmoji/master/color/72x72/1F3F3.png

Another issue with the code I just noticed is, that it e.g. doesn't work with https://openmoji.org/library/emoji-0035-FE0F-20E3/ (The leading 0s are missing)

Thanks for noting this! I've just written a new Python script that should now handle all cases properly. @Joshix-1, do you want to test this?

from PIL import Image
import requests

def get_emoji(emoji):
    emoji_code = "-".join(f"{ord(c):04x}" for c in emoji).upper()
    print(emoji_code)
    if len(emoji) == 2:
        emoji_code = emoji_code.removesuffix("-FE0F")
    url = f"https://raw.githubusercontent.com/hfg-gmuend/openmoji/master/color/72x72/{emoji_code}.png"
    print(url)
    im = Image.open(requests.get(url, stream=True).raw)
    # image = np.array(im.convert("RGBA"))
    return im
imgs = []
imgs += [get_emoji("🦴")]  # Code: "1F9B4" > all good
imgs += [get_emoji("🐿️")]  # Code: "1F43F-FE0F" > can be found under "1F43F", so "FE0F" has to be removed
imgs += [get_emoji("🏳️")]  # Code: "1F3F3-FE0F" > can be found either under "1F3F3" or under "1F3F3-FE0F", so "FE0F" can be removed
imgs += [get_emoji("5️⃣")]  # Problem with missing zero? > solved with 04x
imgs += [get_emoji("👩‍⚕️")]  # Code: "1F469-200D-2695-FE0F" > here, FE0F should not be removed

#only for debugging:
import matplotlib.pyplot as plt

plt.figure(figsize=(20,10))
columns = 5
for i, image in enumerate(imgs):
    plt.subplot(int(len(imgs) / columns + 1), columns, i + 1)
    plt.imshow(image)
plt.show()

kolibril13 avatar Jul 29 '22 08:07 kolibril13

For me, it does not break, I could also find this one: https://raw.githubusercontent.com/hfg-gmuend/openmoji/master/color/72x72/1F3F3.png

Yes, true. (I just looked for files ending with -FE0F and didn't check whether a version without it exists.) But it's weird that the flag has two representations and none of the others do.
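One way to sidestep the question of which spelling a file uses is to try the exact name first and fall back to the name without the trailing selector. This is only a sketch of that idea, not what the pull request implements; `candidate_codes` and `candidate_urls` are hypothetical names, and the base URL is the one used in this thread.

```python
from typing import List

BASE_URL = "https://raw.githubusercontent.com/hfg-gmuend/openmoji/master/color/72x72"


def candidate_codes(emoji: str) -> List[str]:
    """Hypothetical helper: file-name candidates, most specific first."""
    full = "-".join(f"{ord(c):04X}" for c in emoji)
    candidates = [full]
    suffix = "-FE0F"
    # Offer the name without the trailing variation selector as a fallback.
    if full.endswith(suffix):
        candidates.append(full[: -len(suffix)])
    return candidates


def candidate_urls(emoji: str) -> List[str]:
    """Build one URL per candidate; the caller decides how to probe them."""
    return [f"{BASE_URL}/{code}.png" for code in candidate_codes(emoji)]


white_flag = "\U0001F3F3\uFE0F"  # 🏳️
print(candidate_codes(white_flag))  # ['1F3F3-FE0F', '1F3F3']
```

A caller could request each URL in order and keep the first one that resolves, at the cost of an extra request when the first name is missing.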

I've just written a new Python script that should now handle all cases properly

Yes, that looks better. I just don't like the += with the list. I think something like the following would be better:

imgs = [
    get_emoji("🦴"),  # Code: "1F9B4" > all good
    get_emoji("🐿️"),  # Code: "1F43F-FE0F" > can be found under "1F43F", so "FE0F" has to be removed
    get_emoji("🏳️"),  # Code: "1F3F3-FE0F" > can be found either under "1F3F3" or under "1F3F3-FE0F", so "FE0F" can be removed
    get_emoji("5️⃣"),  # Problem with missing zero? > solved with 04x
    get_emoji("👩‍⚕️"),  # Code: "1F469-200D-2695-FE0F" > here, FE0F should not be removed
]

Joshix-1 avatar Jul 29 '22 14:07 Joshix-1

Hi @kolibril13, hope you've had a nice summer! Sorry for the ultra-late reply! Is this PR ready to merge? :)

b-g avatar Nov 04 '22 15:11 b-g

Hi @b-g, the summer was great, I hope yours was as well :) I've just fixed the -FE0F issue in the JavaScript implementation, so the PR is ready to merge!

kolibril13 avatar Nov 05 '22 14:11 kolibril13

Hi @kolibril13, Great! Many thanks! + Merged.

b-g avatar Nov 05 '22 15:11 b-g