ocaml-emoji
New release on opam
I noticed you switched to Dune, so it makes sense to upload a new release to opam.
@zapashcanon
Hi, I strongly recommend releasing a new version after #7 (and maybe #9), with a major version bump because many names have changed. Thanks!
Together with @Swrup, we improved a lot of things; see his fork. We would like to fix a few more things, then we'll open a PR in this repository. Once that's done, I can take care of publishing it on opam.
@zapashcanon @Swrup The fork looks exciting with all the (sub)categories! I see that we all noticed some of the same issues (e.g. diacritics) but fixed them slightly differently, and in some cases I love your solution (e.g., parsing the official data files), and in some cases I like mine better (e.g., using "first" instead of "_1st"). How should I join the party? Should I wait for the PR?
Hi @favonia, sorry I should have made a PR earlier.
The thing is, we wanted to ship only the generator code and not the generated emoji.ml. But to get the categories we need to read the HTML row by row, which makes it extremely slow: it takes 2 min to generate emoji.ml ...
Also, I would like to have all pictographs (maybe in another lib). See "Emoji & Pictographs" and "Other Symbols" at https://www.unicode.org/charts/. To do that, I think I will have to use the XML file with all characters.
@Swrup What's missing in your fork now? I am interested in helping if it only needs some easy fix. I do think in the long run we should parse the data files instead of scraping the web pages, and uucd is a good library. (Not sure about uucp because emojis are spread across so many blocks.) Anyways, looking forward to your PR.
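To make the "parse the data files" idea concrete, here is a minimal sketch of a line parser. It assumes the plain-text emoji-test.txt file that Unicode publishes for each emoji version, which nobody in this thread has committed to using. The names `line`, `strip_prefix` and `parse_line` are made up for illustration, and it relies only on the standard library rather than uucd.

```ocaml
(* Hypothetical classification of one line of emoji-test.txt. *)
type line =
  | Group of string
  | Subgroup of string
  | Emoji of { codepoints : int list; status : string; name : string }
  | Other

(* Returns the remainder of [s] after [prefix], if [s] starts with it. *)
let strip_prefix ~prefix s =
  let n = String.length prefix in
  if String.length s >= n && String.sub s 0 n = prefix
  then Some (String.sub s n (String.length s - n))
  else None

(* Assumed data line layout:
   CODEPOINTS ; STATUS # GLYPH VERSION NAME
   The first hash after the semicolon starts the comment, so a name
   that itself contains a hash is kept intact. *)
let parse_line (l : string) : line =
  match strip_prefix ~prefix:"# group: " l,
        strip_prefix ~prefix:"# subgroup: " l with
  | Some g, _ -> Group (String.trim g)
  | _, Some sg -> Subgroup (String.trim sg)
  | None, None ->
    if l = "" || l.[0] = '#' then Other
    else begin
      match String.index_opt l ';' with
      | None -> Other
      | Some semi ->
        let cps = String.sub l 0 semi in
        let rest = String.sub l (semi + 1) (String.length l - semi - 1) in
        begin match String.index_opt rest '#' with
        | None -> Other
        | Some hash ->
          let status = String.trim (String.sub rest 0 hash) in
          let comment =
            String.sub rest (hash + 1) (String.length rest - hash - 1) in
          let non_empty w = w <> "" in
          let words =
            List.filter non_empty (String.split_on_char ' ' comment) in
          begin match words with
          | _glyph :: _version :: (_ :: _ as name_words) ->
            (* Code points are space-separated hexadecimal numbers. *)
            let codepoints =
              String.split_on_char ' ' cps
              |> List.filter non_empty
              |> List.map (fun h -> int_of_string ("0x" ^ h)) in
            Emoji { codepoints; status; name = String.concat " " name_words }
          | _ -> Other
          end
        end
    end
```

Folding `parse_line` over the file while remembering the most recent `Group` and `Subgroup` would give each emoji its (sub)category without reading any HTML.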
@zapashcanon @Swrup Any progress on this?
Sorry, I don't remember what was left to do. I think @Swrup had something working and it just needs to be published on opam.
Hello, sorry I forgot about it.
My fork works with Unicode v15, published in September. It has a total of 3664 emojis!
The code that generates emoji.ml parses pages from unicode.org: http://www.unicode.org/emoji/charts/emoji-list.html and https://www.unicode.org/emoji/charts/full-emoji-modifiers.html. It has all emojis according to https://www.unicode.org/emoji/charts/emoji-counts.html. It adds diacritics fixes. (I think staying closer to the official names by using '_', and staying consistent, is better.) It adds (sub)categories!
However, it conflicts with the latest commits, and emoji.ml is still not generated at compilation (because it's too slow), which some dislike.
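For illustration, the naming approach discussed above (official name, '_' separators, diacritics replaced) could look roughly like this. It is a sketch, not the fork's actual code: `ascii_fallbacks`, `strip_diacritics` and `ident_of_name` are hypothetical names, and the diacritics table is deliberately incomplete.

```ocaml
(* Assumed, incomplete table: UTF-8 byte sequence, ASCII stand-in. *)
let ascii_fallbacks =
  [ "\xC3\xA3", "a";  (* a with tilde *)
    "\xC3\xA9", "e";  (* e with acute *)
    "\xC3\xA7", "c";  (* c with cedilla *)
    "\xC3\xB4", "o";  (* o with circumflex *)
    "\xC3\xAD", "i";  (* i with acute *)
    "\xC3\xB1", "n";  (* n with tilde *)
    "\xC3\x85", "a" ] (* capital A with ring *)

(* True when [prefix] occurs in [s] at byte offset [i]. *)
let starts_with_at s i prefix =
  let n = String.length prefix in
  i + n <= String.length s && String.sub s i n = prefix

(* Replaces every sequence listed in [ascii_fallbacks]. *)
let strip_diacritics s =
  let buf = Buffer.create (String.length s) in
  let rec go i =
    if i < String.length s then begin
      let hit =
        List.find_opt (fun (seq, _) -> starts_with_at s i seq) ascii_fallbacks
      in
      match hit with
      | Some (seq, repl) ->
        Buffer.add_string buf repl;
        go (i + String.length seq)
      | None ->
        Buffer.add_char buf s.[i];
        go (i + 1)
    end
  in
  go 0;
  Buffer.contents buf

(* Lowercases, collapses every run of other bytes into one underscore,
   and prefixes an underscore when the result would start with a digit. *)
let ident_of_name name =
  let name = strip_diacritics name in
  let buf = Buffer.create (String.length name) in
  let last_was_sep = ref true in
  String.iter
    (fun c ->
      match c with
      | 'a' .. 'z' | '0' .. '9' ->
        Buffer.add_char buf c;
        last_was_sep := false
      | 'A' .. 'Z' ->
        Buffer.add_char buf (Char.lowercase_ascii c);
        last_was_sep := false
      | _ ->
        if not !last_was_sep then Buffer.add_char buf '_';
        last_was_sep := true)
    name;
  let s = Buffer.contents buf in
  let n = String.length s in
  let s = if n > 0 && s.[n - 1] = '_' then String.sub s 0 (n - 1) else s in
  if s <> "" && s.[0] >= '0' && s.[0] <= '9' then "_" ^ s else s
```

With this sketch, `ident_of_name "1st place medal"` yields `_1st_place_medal`, the style this thread calls "_1st". A spelled-out "first" would need an extra lookup table on top.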