Version 3 and Version 5 UUIDs use an unknown string encoding for non-ASCII input
Comparing V3 UUIDs generated with https://github.com/Faithlife/FaithlifeUtility/blob/master/src/Faithlife.Utility/GuidUtility.cs against those from https://www.uuidtools.com/generate/v3:
namespace: 33b23789-087d-458a-91be-0458615b9b77
value: lol test balle|lll|lll|æøå

- uuidtools: 6fcd6c2b-e356-34ff-a146-cb392dbe242a
- Faithlife, using Encoding.UTF8.GetBytes(value): 12fc7835-e25c-3ffe-abe9-a73aa487dd57
- Faithlife, using Encoding.ASCII.GetBytes(value): 6dbd69f8-5aa0-3f94-acf6-8f72a0664856
- Faithlife, using raw bytes, value.Select(c => (byte)c).ToArray(): 6fcd6c2b-e356-34ff-a146-cb392dbe242a

So it seems uuidtools uses raw (or no) encoding.
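For anyone who wants to reproduce this comparison without either tool, here is a minimal Python sketch. It assumes the standard RFC 4122 name-based construction (MD5 over the namespace bytes followed by the name bytes). The helper name `uuid3_with_encoding` is made up for illustration. Latin-1 stands in for the raw-byte cast, which is only equivalent for characters up to U+00FF.

```python
import hashlib
import uuid


def uuid3_with_encoding(namespace: uuid.UUID, name: str, encoding: str) -> uuid.UUID:
    """Hypothetical helper: version 3 UUID with an explicit name encoding."""
    # errors="replace" substitutes "?" for characters the encoding cannot
    # represent (assumed here to be close to what Encoding.ASCII does in .NET).
    name_bytes = name.encode(encoding, errors="replace")
    digest = hashlib.md5(namespace.bytes + name_bytes).digest()
    # Passing version=3 makes uuid.UUID set the version and variant bits.
    return uuid.UUID(bytes=digest, version=3)


namespace = uuid.UUID("33b23789-087d-458a-91be-0458615b9b77")
value = "lol test balle|lll|lll|\u00e6\u00f8\u00e5"  # ends in æøå

for encoding in ("utf-8", "ascii", "latin-1"):
    print(encoding, uuid3_with_encoding(namespace, value, encoding))
```

Python's built-in `uuid.uuid3` always encodes a `str` name as UTF-8, so it should agree with the `utf-8` line printed here.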
With an ASCII-only value, everything agrees:

value: lol test balle|lll|lll|ok

- uuidtools: 5559e297-b54a-3ba6-b9ff-82b6bfb01683
- Faithlife, using Encoding.UTF8.GetBytes(value): 5559e297-b54a-3ba6-b9ff-82b6bfb01683
- Faithlife, using Encoding.ASCII.GetBytes(value): 5559e297-b54a-3ba6-b9ff-82b6bfb01683
- Faithlife, using raw bytes, value.Select(c => (byte)c).ToArray(): 5559e297-b54a-3ba6-b9ff-82b6bfb01683

All four results match, as expected.
Maybe the web page could state which encoding it uses. Maybe it could let the user choose the encoding.

But what if I try a value with a really "weird" character, like 𠜎?

value: lol test balle|lll|lll|𠜎

Now uuidtools generates no UUID at all (it hangs). With the same value, Faithlife gives:

- using Encoding.UTF8.GetBytes(value): 41f06a67-bc17-3067-8976-52b9e4219fd1
- using Encoding.ASCII.GetBytes(value): 8e770939-4bc5-364d-a646-2d600d2bcfb7
- using raw bytes, value.Select(c => (byte)c).ToArray(): f2d0b3aa-7b8a-32a3-b7b5-722484dc6370

So uuidtools hangs and gives no value, while the three different "encodings" each give a different value. Not good.
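A likely reason the three results diverge: 𠜎 lies outside the Basic Multilingual Plane, so a .NET string holds it as two UTF-16 code units (a surrogate pair), and no single-byte encoding can represent it. The Python sketch below illustrates this. It assumes the `(byte)` cast runs in an unchecked context and keeps only the low 8 bits of each code unit. `low_bytes_of_utf16_units` is a hypothetical helper, not part of either tool, and nothing here explains why uuidtools hangs.

```python
def low_bytes_of_utf16_units(text: str) -> bytes:
    """Hypothetical emulation of an unchecked (byte) cast per UTF-16 code unit."""
    # UTF-16-LE stores each code unit as (low byte, high byte); keep the low ones.
    return text.encode("utf-16-le")[::2]


value = "lol test balle|lll|lll|𠜎"

utf8_bytes = value.encode("utf-8")
truncated = low_bytes_of_utf16_units(value)

# The two byte sequences differ in length, so the hashed input differs too.
print(len(utf8_bytes), len(truncated))

# Latin-1 only covers U+0000..U+00FF, so a strict encode fails on this character.
try:
    value.encode("latin-1")
except UnicodeEncodeError as error:
    print("latin-1 cannot encode:", error.object[error.start:error.end])
```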
Proposal (one or more):
- State which encoding is used
- Make it possible to choose the encoding
- Always use UTF-8, or at least make it the default (UTF-8 is used everywhere today; this is not controversial IMO)
- Do not hang on "weird" characters
Thanks for the great tool!
I'm experiencing similar issues. When I generated a UUID v5 for the word Français in the web GUI, I got a different result than from my Python script. I quickly found out that Python was encoding the word as UTF-8 and producing the expected base64 (RnJhbsOnYWlz), whereas uuidtools turns the same word into a different base64 string (RnJhbudhaXM=), which you can see by clicking "Copy API Call". Decoding RnJhbudhaXM= as UTF-8 does not yield Français, but decoding it as Latin-1 (ISO-8859-1) does. This led me to conclude that uuidtools does not use UTF-8 encoding. It would be a major improvement if this were documented or, even better, if the tool let users choose their preferred encoding.
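The base64 observation can be checked with a few lines of Python using only the standard library. This is a small sketch of the check, not the script mentioned in the comment:

```python
import base64

word = "Fran\u00e7ais"  # "Français" with a precomposed ç (U+00E7)

utf8_b64 = base64.b64encode(word.encode("utf-8")).decode("ascii")
latin1_b64 = base64.b64encode(word.encode("latin-1")).decode("ascii")
print(utf8_b64)    # RnJhbsOnYWlz
print(latin1_b64)  # RnJhbudhaXM=

# The string produced by uuidtools decodes cleanly as Latin-1 but not as UTF-8.
raw = base64.b64decode("RnJhbudhaXM=")
print(raw.decode("latin-1"))
try:
    raw.decode("utf-8")
except UnicodeDecodeError:
    print("not valid UTF-8")
```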