
compile / decompile emojis

Open jwunderl opened this issue 4 years ago • 5 comments

I like using emojis in examples sometimes, but the names get mangled when you compile (they turn into some combination of _ and numbers), so this would allow emojis to persist across compile -> decompile:

[image: shiny shinies]

Obviously the generated names aren't great (they're just "E" + the emoji's codepoint in base 16 + "X"; maybe we could use $ to surround it instead, as that's also valid and uncommonly used anyway), but I don't know that it's any worse than just _ / _2 / etc. Starting this as a draft, though, as I don't know if we'd necessarily want to take it for this reason.

test build with a sample project: https://arcade.makecode.com/app/0610dda95754bed7e6df6ce7485f7d128dd0f4ea-1375337f35#pub:_XLqH792iMbks

Most of the diff is from https://github.com/microsoft/pxt/pull/7861/files#diff-b3b92c2632ab9b3508e5ea8cd2f72f4d5829fbfb8ef35ab61303cc85d61b93aeR30, which I needed to change here as it produces invalid encodings for most emojis (unicode 'astral plane' codepoints) and breaks anything above 0xffff. As far as I can tell, html / xml with utf8 encoding should only ever need to escape the special characters <>"'& anyway, so this seems fine / better anyways? E.g. https://www.w3.org/International/questions/qa-escapes#use & https://hsivonen.fi/producing-xml/#noescaping
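To illustrate the escaping point above, here is a minimal sketch of an escaper that touches only the five special characters and passes everything else (including astral-plane emoji) through unchanged. The function name `escapeXmlMinimal` is hypothetical and this is not the code from the PR; it also assumes the output is served as UTF-8.

```typescript
// Hypothetical helper (not pxt's actual function): escape only the five
// characters that are special in HTML/XML and leave all other text alone.
function escapeXmlMinimal(text: string): string {
    return text.replace(/[<>"'&]/g, (ch: string): string => {
        switch (ch) {
            case "<": return "&lt;";
            case ">": return "&gt;";
            case '"': return "&quot;";
            case "'": return "&#39;"; // numeric form works in both HTML and XML
            default: return "&amp;";  // the only remaining match is "&"
        }
    });
}
```

Because no numeric character references are generated per UTF-16 code unit, surrogate pairs are never split, which is the failure mode for codepoints above 0xffff.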

I messed up switching between branches after my local version of the branch broke and accidentally pulled when I should have checkout'd, so I accidentally merged this into master / reverted it already D:. The commits from when I wrote it are in https://github.com/microsoft/pxt/compare/8e3181e...bb4281a, and I reverted that in https://github.com/microsoft/pxt/commit/596e90c79336f81c1a086ec05149ec8bdc544ff5, so this pr would... revert that.

Edit: added support for the debugger view as well, and made it show unescaped names (e.g. a variable named text list currently shows as text_list in arcade/live, but with this change it shows without the space escaped):

image

also fixes https://github.com/microsoft/pxt-arcade/issues/2774

jwunderl avatar Feb 04 '21 06:02 jwunderl

i wonder if we could do a mapping to the unicode names instead? Might look a bit nicer (but would definitely be more complicated).

riknoll avatar Feb 17 '21 23:02 riknoll

For using the unicode names, I believe the values should be the same in general -- that is, it's currently encoding U+1f4da as E1f4daX in the name just to make it easier to escape and to avoid a case where it tries to swap a user's actual intended variable name with an emoji that happens to match. I can make it U1f4da$ or something instead (I just want to keep a marker on each side so it's unambiguous which characters are part of the emoji and which are just letters that happen to be valid hex chars) -- only needs changing like this: https://github.com/microsoft/pxt/pull/7861/commits/db107f0cc42c068a613aee5c8d3b0b8688586ea4
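As a rough sketch of the marker scheme described above (not the PR's actual implementation): the helpers below are hypothetical, and they assume only astral-plane characters, i.e. UTF-16 surrogate pairs, are treated as emoji. Real emoji detection would need to be broader, since some emoji live below 0xffff.

```typescript
// Hypothetical helpers illustrating "E" + hex codepoint + "X".
// Assumption: only surrogate pairs (codepoints above 0xffff) are escaped.
function escapeEmoji(name: string): string {
    return name.replace(/[\uD800-\uDBFF][\uDC00-\uDFFF]/g, (pair: string): string => {
        const hi = pair.charCodeAt(0);
        const lo = pair.charCodeAt(1);
        // Combine the surrogate pair back into a single codepoint.
        const codePoint = (hi - 0xd800) * 0x400 + (lo - 0xdc00) + 0x10000;
        return "E" + codePoint.toString(16) + "X";
    });
}

function unescapeEmoji(name: string): string {
    // Astral codepoints always take 5 or 6 hex digits.
    return name.replace(/E([0-9a-f]{5,6})X/g, (match: string, hex: string): string => {
        const codePoint = parseInt(hex, 16);
        if (codePoint < 0x10000 || codePoint > 0x10ffff) return match; // not a valid astral codepoint
        const offset = codePoint - 0x10000;
        return String.fromCharCode(0xd800 + (offset >> 10), 0xdc00 + (offset & 0x3ff));
    });
}

// U+1f4da written as its surrogate pair.
const books = "\uD83D\uDCDA";
console.log(escapeEmoji("my" + books)); // myE1f4daX
```

Note that `unescapeEmoji` would also rewrite a hand-written identifier that happens to look like `E1f4daX`; that is the collision the markers on each side are meant to make unlikely, not impossible.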

jwunderl avatar Feb 18 '21 01:02 jwunderl

@jwunderl I meant the name of the emoji. e.g. "grinningFace" and "personInSuitLevitating"

riknoll avatar Feb 18 '21 01:02 riknoll

Okay, finally brought this up to date; I'm gonna try and get this in for the next arcade release :) Here's a test build. Everything appears to still be working, but I'll poke around a bit more.

https://arcade.makecode.com/app/ef36c7fc259238e12d680c626d4af6c6bbf1b771-418c08602e#pub:_Rd0UJmiP0RWq

if we wanted to use the english name for the emoji we'd need to do something like fetch it from https://github.com/unicode-org/cldr which is ... a bit much.

We could also try base-36'ing it; the benefit is that it appears to be pretty consistently 3-4 chars per id instead of 4-6.
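A quick way to compare the two encodings, as a sketch only: `encodeCodePoint` is a hypothetical name, and the two sample codepoints are just illustrations, not a survey of all emoji.

```typescript
// Hypothetical helper: render a codepoint in the given radix (16 or 36).
function encodeCodePoint(codePoint: number, radix: number): string {
    return codePoint.toString(radix);
}

// U+1f4da (astral plane) and U+2600 (below 0xffff) as sample inputs.
for (const cp of [0x1f4da, 0x2600]) {
    const hex = encodeCodePoint(cp, 16);
    const b36 = encodeCodePoint(cp, 36);
    console.log(hex + " (" + hex.length + " chars) vs " + b36 + " (" + b36.length + " chars)");
}
```

One trade-off to keep in mind: base 36 uses the whole a-z range, so the encoded ids look more like ordinary identifier text and lean more heavily on the surrounding markers to stay unambiguous.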

jwunderl avatar Jul 12 '22 20:07 jwunderl

Is this PR still valid?

abchatra avatar Oct 18 '22 03:10 abchatra