openmoji
FAQ: Better handling of Unicode value U+FE0F with Python+JavaScript.
Don't merge yet, it's only a draft pull request. Attempt to solve #404. @Joshix-1 can you have a look at this? It's not yet a working solution, because sometimes this character is needed:
🦴 -> "1F9B4", can be found in https://raw.githubusercontent.com/hfg-gmuend/openmoji/master/color/72x72/1F9B4.png
🐿️ -> "1F43F-FE0F", can be found in https://raw.githubusercontent.com/hfg-gmuend/openmoji/master/color/72x72/1F43F.png
👩‍⚕️ -> "1F469-200D-2695-FE0F", can be found in https://raw.githubusercontent.com/hfg-gmuend/openmoji/master/color/72x72/1F469-200D-2695-FE0F.png
Currently, the last example would break, because the FE0F should not be removed.
I think a case distinction is needed here. Am I right in assuming that all emojis consisting of more than two of these codes separated by a "-" should keep their trailing FE0F?
🏳️ would break too as it is saved as 1F3F3-FE0F https://raw.githubusercontent.com/hfg-gmuend/openmoji/master/color/72x72/1F3F3-FE0F.png
If it doesn't really matter if it is a real emoji, wouldn't it be better to do it like GitHub and always remove the -FE0F?
If it only gets removed for the short sequences, the following code should work for all emojis except the white flag:
emoji_code = "-".join(f"{ord(c):x}" for c in emoji).upper()
if len(emoji) == 2:
    emoji_code = emoji_code.removesuffix("-FE0F")
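const codePoints = [...emoji]; // spread iterates by code point; emoji.length would count UTF-16 code units
let emojiCode = codePoints.map(e => e.codePointAt(0).toString(16)).join(`-`).toUpperCase();
if (codePoints.length === 2) emojiCode = emojiCode.replace(/-FE0F$/, "");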
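A side note on the length check: Python's `len()` counts code points, whereas a JavaScript string's `.length` counts UTF-16 code units, so a code-point count is the safer basis for the check in both languages. The sketch below makes the difference visible from Python. The helper name `utf16_length` is hypothetical and not part of the PR.

```python
def utf16_length(text: str) -> int:
    """Number of UTF-16 code units, i.e. what JavaScript's `.length` reports."""
    return len(text.encode("utf-16-le")) // 2


squirrel = "\U0001F43F\uFE0F"  # chipmunk + variation selector-16
relaxed = "\u263A\uFE0F"       # smiling face + variation selector-16

print(len(squirrel), utf16_length(squirrel))  # 2 code points, 3 UTF-16 units
print(len(relaxed), utf16_length(relaxed))    # 2 code points, 2 UTF-16 units
```

For emojis outside the Basic Multilingual Plane the two counts differ, so a check against `2` would behave differently depending on which count is used.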
Am I right in assuming that all emojis consisting of more than two of these codes separated by a "-" should keep their trailing FE0F?
I think none should have it removed, but I'm not sure.
Another issue with the code I just noticed is that it doesn't work with e.g. https://openmoji.org/library/emoji-0035-FE0F-20E3/ (the leading zeros are missing).
Fix for Python:
"-".join(f"{ord(c):04x}" for c in emoji).upper()
Fix for JavaScript:
[...emoji].map(e => e.codePointAt(0).toString(16).padStart(4, '0')).join(`-`).toUpperCase()
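To illustrate the padding fix, here is a minimal sketch. The function name `emoji_to_code` is hypothetical, and the emojis are written as escape sequences so that the invisible U+FE0F cannot get lost when copying.

```python
def emoji_to_code(emoji: str) -> str:
    """Join the emoji's code points as uppercase hex, padded to at least 4 digits."""
    return "-".join(f"{ord(c):04X}" for c in emoji)


keycap_five = "5\uFE0F\u20E3"  # digit five + variation selector-16 + combining keycap

print(emoji_to_code(keycap_five))   # 0035-FE0F-20E3
print(emoji_to_code("\U0001F9B4"))  # 1F9B4 (five digits are left untouched)
```

The `04` in the format spec is only a minimum width, so code points that already have five hex digits are not truncated.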
🏝 OpenMoji is on hold over summer (project maintainers are out of office until Oct 2022).
🏳️ would break too as it is saved as 1F3F3-FE0F https://raw.githubusercontent.com/hfg-gmuend/openmoji/master/color/72x72/1F3F3-FE0F.png
For me, it does not break. I could also find this one: https://raw.githubusercontent.com/hfg-gmuend/openmoji/master/color/72x72/1F3F3.png
Another issue with the code I just noticed is that it doesn't work with e.g. https://openmoji.org/library/emoji-0035-FE0F-20E3/ (the leading zeros are missing).
Thanks for pointing this out! I've just written a new Python script that should now handle all cases properly. @Joshix-1, do you want to test it?
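from io import BytesIO

import requests
from PIL import Image

def get_emoji(emoji):
    # File name: zero-padded hex code points joined by "-", in upper case.
    emoji_code = "-".join(f"{ord(c):04x}" for c in emoji).upper()
    print(emoji_code)
    # Emojis with exactly two code points are stored without the trailing FE0F.
    if len(emoji) == 2:
        emoji_code = emoji_code.removesuffix("-FE0F")  # needs Python 3.9+
    url = f"https://raw.githubusercontent.com/hfg-gmuend/openmoji/master/color/72x72/{emoji_code}.png"
    print(url)
    response = requests.get(url)
    response.raise_for_status()  # fail early if the file does not exist
    im = Image.open(BytesIO(response.content))
    # image = np.array(im.convert("RGBA"))
    return im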
imgs = []
imgs += [get_emoji("🦴")] # Code: "1F9B4" > all good
imgs += [get_emoji("🐿️")] # Code: "1F43F-FE0F" > can be found under "1F43F" so "FE0F" has to be removed
imgs += [get_emoji("🏳️")] # Code: "1F3F3-FE0F" > can be found either under "1F3F3" or under "1F3F3-FE0F" so "FE0F" can be removed.
imgs += [get_emoji("5️⃣")] # Problem with missing zero? > solved with 04x
imgs += [get_emoji("👩‍⚕️")] # Code: "1F469-200D-2695-FE0F" > Here, FE0F must not be removed.
# only for debugging:
import matplotlib.pyplot as plt

plt.figure(figsize=(20, 10))
columns = 5
for i, image in enumerate(imgs):
    plt.subplot(int(len(imgs) / columns + 1), columns, i + 1)
    plt.imshow(image)
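A small remark on the grid size: `int(len(imgs) / columns + 1)` reserves an extra, empty row whenever the number of images is an exact multiple of `columns`, as is the case with five images and five columns. One possible alternative is ceiling division, sketched here with a hypothetical helper `grid_rows` that is not part of the script:

```python
import math


def grid_rows(n_images: int, columns: int) -> int:
    """Smallest number of rows that fits n_images into the given number of columns."""
    return max(1, math.ceil(n_images / columns))


print(grid_rows(5, 5))  # 1
print(grid_rows(6, 5))  # 2
```

The result could be passed as the first argument to `plt.subplot` instead of the inline expression.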
For me, it does not break. I could also find this one: https://raw.githubusercontent.com/hfg-gmuend/openmoji/master/color/72x72/1F3F3.png
Yes, true. I just looked for files ending with -FE0F and didn't check whether they also exist without it. But it's weird that the flag has two representations and all the others don't.
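Since the repository apparently stores at least one emoji under more than one file name, one defensive option would be to generate several candidate names and try them in order. The sketch below is only an assumption about how that could look, not what the PR implements. `candidate_codes` is a hypothetical helper, and the actual lookup (for example one HTTP request per candidate) is left out.

```python
def candidate_codes(emoji: str) -> list[str]:
    """Return possible OpenMoji file names (without extension), most specific first."""
    full = "-".join(f"{ord(c):04X}" for c in emoji)
    candidates = [full]
    stripped = full.removesuffix("-FE0F")  # needs Python 3.9+
    if stripped != full:
        candidates.append(stripped)
    return candidates


print(candidate_codes("\U0001F3F3\uFE0F"))  # ['1F3F3-FE0F', '1F3F3']
print(candidate_codes("\U0001F9B4"))        # ['1F9B4']
```

Trying the full code first and the stripped code second avoids hard-coding a rule based on the number of code points, at the cost of an extra request for some emojis.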
I've just written a new Python script that should now handle all cases properly
Yes, that looks better. I just don't like the += with the list. I think something like the following would be better:
imgs = [
    get_emoji("🦴"), # Code: "1F9B4" > all good
    get_emoji("🐿️"), # Code: "1F43F-FE0F" > can be found under "1F43F" so "FE0F" has to be removed
    get_emoji("🏳️"), # Code: "1F3F3-FE0F" > can be found either under "1F3F3" or under "1F3F3-FE0F" so "FE0F" can be removed.
    get_emoji("5️⃣"), # Problem with missing zero? > solved with 04x
    get_emoji("👩‍⚕️"), # Code: "1F469-200D-2695-FE0F" > Here, FE0F must not be removed.
]
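The expected file names from the comments can also be checked without any network access. The following is a minimal sketch: `emoji_file_code` is a hypothetical name that repeats the naming rule from `get_emoji`, and the emojis are written as escape sequences so that the invisible U+200D and U+FE0F characters cannot get lost when copying.

```python
def emoji_file_code(emoji: str) -> str:
    """Same rule as in get_emoji: strip a trailing FE0F only for two-code-point emojis."""
    code = "-".join(f"{ord(c):04x}" for c in emoji).upper()
    if len(emoji) == 2:
        code = code.removesuffix("-FE0F")  # needs Python 3.9+
    return code


# Expected file names, taken from the comments in the thread.
expected = {
    "\U0001F9B4": "1F9B4",                                   # bone
    "\U0001F43F\uFE0F": "1F43F",                             # chipmunk
    "\U0001F3F3\uFE0F": "1F3F3",                             # white flag
    "5\uFE0F\u20E3": "0035-FE0F-20E3",                       # keycap five
    "\U0001F469\u200D\u2695\uFE0F": "1F469-200D-2695-FE0F",  # woman health worker
}

for emoji, code in expected.items():
    assert emoji_file_code(emoji) == code, (emoji, code)
print("all file names match")
```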
Hi @kolibril13, Hope you've had a nice summer! Sorry for the ultra late reply! Is this PR ready to merge? :)
Hi @b-g, the summer was great, I hope yours was as well :) I've just fixed the -FE0F issue in the JavaScript implementation, so the PR is ready to merge!
Hi @kolibril13, Great! Many thanks! + Merged.