Add support for UTF-8 symbols

Open ClassTerr opened this issue 2 years ago • 2 comments

Hello, and thank you for such a great library. I am using it to shorten UUIDs in URLs.

But even with this library and the flickr58 alphabet, my UUID is still 22 characters long. That is better, but still not very short, so I experimented with extending the alphabet with additional characters. Since modern browsers support emojis in URLs, I tried setting the alphabet to a list of emojis, but got the following error: The provided Alphabet has duplicate characters resulting in unreliable results. The cause is that a single Unicode character may occupy more than one code unit in a JS string:

'💚'.length === 2
new Set('💚').size === 1

So at least this duplicate check gives the wrong result for such alphabets.
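A minimal sketch of a duplicate check that counts code points instead of UTF-16 code units. `hasDuplicateSymbols` is a hypothetical helper for illustration, not the library's actual code, and it still treats emojis made of several code points (flags, ZWJ sequences) as several symbols.

```javascript
// Hypothetical helper, not part of short-uuid: compares symbol counts by
// code point rather than by UTF-16 code unit.
function hasDuplicateSymbols(alphabet) {
  // Array.from iterates a string by code point, so '💚' is a single entry.
  const symbols = Array.from(alphabet);
  return new Set(symbols).size !== symbols.length;
}

console.log(hasDuplicateSymbols('💚💙')); // false
console.log(hasDuplicateSymbols('💚💚')); // true
```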

I am wondering whether support for such characters will be implemented in the future.
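For context, a rough estimate of how much a larger alphabet shortens the result, assuming a UUID carries 128 bits and every symbol encodes log2(n) bits. `shortIdLength` is a hypothetical name used only for this illustration; it counts symbols, not bytes.

```javascript
// Symbols needed to represent a 128-bit UUID with an alphabet of n symbols.
function shortIdLength(alphabetSize) {
  return Math.ceil(128 / Math.log2(alphabetSize));
}

console.log(shortIdLength(58));   // 22, matching the flickr58 length above
console.log(shortIdLength(1024)); // 13
```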

ClassTerr avatar Apr 05 '22 10:04 ClassTerr

This is definitely something that would be great to have, especially for making human-readable values. Because of the complexity of how emoji split into code points, we would need something like grapheme-splitter to resolve this consistently. That is a large library because it involves a large set of static values.

We could potentially accept alphabets as arrays, which could solve the initial creation, but we would fail to translate back without grapheme-splitter.
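A rough sketch of that trade-off, assuming a plain base conversion over an array alphabet. `encode` and `decode` are hypothetical names, not short-uuid's API. Encoding only needs array indexing, while decoding has to split the string back into symbols, and splitting by code point breaks for emojis built from several code points.

```javascript
// Hypothetical sketch, not short-uuid's API: base conversion of a
// non-negative BigInt over an alphabet given as an array of symbols.
function encode(value, symbols) {
  const base = BigInt(symbols.length);
  let n = BigInt(value);
  let out = '';
  do {
    out = symbols[Number(n % base)] + out;
    n /= base;
  } while (n > 0n);
  return out;
}

// Decoding must split the text back into symbols. Array.from splits by
// code point, which is only correct when every symbol is one code point.
function decode(text, symbols) {
  const base = BigInt(symbols.length);
  let n = 0n;
  for (const symbol of Array.from(text)) {
    const index = symbols.indexOf(symbol);
    if (index === -1) throw new Error(`Unknown symbol: ${symbol}`);
    n = n * base + BigInt(index);
  }
  return n;
}

const hearts = ['💚', '💙', '💜']; // one code point each
console.log(decode(encode(12345n, hearts), hearts) === 12345n); // true

const flags = ['🇺🇦', '🇯🇵']; // two code points each
try {
  decode(encode(5n, flags), flags);
} catch (err) {
  console.log(err.message); // decoding fails: the flags were split apart
}
```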

Because of the size and potential performance impact, I think we could implement this as a separate export to the library. This would allow developers to choose that support if desired.

oculus42 avatar May 16 '22 18:05 oculus42

IIUC, this proposal may be what you need. It is not yet fully available, though. Once it is, it may solve the problem using a standard API (= no impact on library size).
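As an illustration, one standard API that splits strings into graphemes is `Intl.Segmenter`. That it is the proposal meant here is an assumption, and the sketch also assumes the runtime ships it. `splitGraphemes` is a hypothetical helper name.

```javascript
// Assumption: Intl.Segmenter exists in the runtime (recent browsers and Node.js).
const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });

// Hypothetical helper: splits text into user-perceived characters.
function splitGraphemes(text) {
  return Array.from(segmenter.segment(text), (part) => part.segment);
}

console.log(splitGraphemes('🇺🇦🇯🇵').length); // 2 graphemes
console.log(Array.from('🇺🇦🇯🇵').length);      // 4 code points
```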

balzdur avatar May 06 '24 16:05 balzdur