
More compact string encoding

Open justin-yan opened this issue 2 years ago • 1 comments

Cool project! I've also been thinking about typeIDs recently (I follow a similar pattern in some of my toy projects) and might be interested in collaborating on a Python implementation.

The approach I've been taking is to run a UUIDv7 through base58 (https://pypi.org/project/base58/) before prefixing, in order to get an even shorter string encoding. I haven't done this at any particular scale, but I'd be curious whether you considered an encoding like this, and whether you see any pros/cons either way.
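To make the length trade-off concrete, here is a minimal sketch that encodes the same 16 UUID bytes four ways and prefixes the base58 form. Everything in it is an assumption for illustration: the base58 encoder is hand-rolled with the Bitcoin alphabet rather than taken from the linked package, the `user` prefix and the `prefixed_id` helper are hypothetical, the sample UUID is an arbitrary fixed value, and the base32 line uses the standard library's RFC 4648 alphabet only as a stand-in for length, which may not match the alphabet typeid actually uses.

```python
import base64
import uuid

# Bitcoin-style base58 alphabet: no 0, O, I, or l (an assumption for this sketch).
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Encode bytes as base58, mapping each leading zero byte to '1'."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + out


def encodings(u: uuid.UUID) -> dict[str, str]:
    """Return several string encodings of the same 128-bit value."""
    raw = u.bytes
    return {
        "hex": raw.hex(),
        # RFC 4648 base32, lowercased and unpadded; used here only for its length.
        "base32": base64.b32encode(raw).decode().rstrip("=").lower(),
        "base58": b58encode(raw),
        "base64url": base64.urlsafe_b64encode(raw).decode().rstrip("="),
    }


def prefixed_id(prefix: str, u: uuid.UUID) -> str:
    """Hypothetical helper: '<prefix>_<base58 of the UUID bytes>'."""
    return f"{prefix}_{b58encode(u.bytes)}"


# Arbitrary fixed sample value, chosen so the output is reproducible.
sample = uuid.UUID("01890a5d-ac96-774b-bcce-b302099a8057")

for name, text in encodings(sample).items():
    print(f"{name:10s} {len(text):2d} {text}")
print(prefixed_id("user", sample))
```

For 16 bytes, the hex form is always 32 characters, unpadded base32 is 26, and unpadded base64url is 22. Base58 needs at most 22 characters, but its length varies with the numeric value unless the output is padded to a fixed width.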

When I was looking for an encoding scheme, I had a similar set of requirements:

URL safe, case-insensitive, avoids ambiguous characters, can be selected for copy-pasting by double-clicking, and is a more compact encoding than the traditional hex encoding used by UUIDs

justin-yan Jun 28 '23 20:06

I actually considered using base64url as the encoding, but unlike base32, that encoding is case-sensitive. Because this is written as a general-purpose library, I didn't want to presuppose the use case for the ID. For example, if you use the ID to name a file, some filesystems are case-insensitive, and base64 would not work there.
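A small sketch of the failure mode described above, using only the standard library. The two 16-byte values are hand-picked for illustration, the helper names are hypothetical, and lowercasing is used as a stand-in for how a case-insensitive filesystem compares names.

```python
import base64


def b64url(raw: bytes) -> str:
    """Unpadded base64url text for the given bytes."""
    return base64.urlsafe_b64encode(raw).decode().rstrip("=")


def b32(raw: bytes) -> str:
    """Unpadded, lowercased RFC 4648 base32 text for the given bytes."""
    return base64.b32encode(raw).decode().rstrip("=").lower()


# Two different 128-bit values, chosen so their base64url forms
# differ only in the case of a single character.
id_a = bytes(16)
id_b = b"\x00\x00\x1a" + bytes(13)

print(b64url(id_a))
print(b64url(id_b))

# Distinct IDs, but identical once case is ignored.
print(b64url(id_a).lower() == b64url(id_b).lower())

# The single-case base32 forms remain distinct.
print(b32(id_a) == b32(id_b))
```

Under these assumptions the two base64url strings would name the same file on a case-insensitive filesystem, while the base32 strings stay distinct because the encoding never relies on case to carry information.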

If you have a use case where you know you can support case sensitivity throughout, then I think base58 and base64url are great encodings (with the benefit that they make the string even shorter).

loreto Jun 28 '23 23:06

Closing, since for the general-purpose typeid we'll leave the encoding as is.

loreto Jul 06 '23 14:07