Unable to use arbitrary uuid versions with Ash.Type.UUID
Code of Conduct
- [x] I agree to follow this project's Code of Conduct
AI Policy
- [x] I agree to follow this project's AI Policy, or I agree that AI was not used while creating this issue.
Versions
Due to the change in #2373 which added strict cast_input for both Ash.Type.UUID and Ash.Type.UUIDv7, it is no longer possible to use arbitrary binary content for UUIDs.
#2373 allows only uuids with a version byte of 1..5 or 7 to be used with Ash.Type.UUID, and prevents anything but version 7 UUIDs from being used with Ash.Type.UUIDv7.
If a user wants to remove this restriction, they may, in each of their projects, instead provide their own copy of Ash.Type.UUID with the cast_input modified to match that capabilities of the underlying datastore (which as far as I can tell always allow any UUID-type data without trying to validate versions).
Validation like this was previously added and reverted in Ecto. https://github.com/elixir-ecto/ecto/issues/2147#issuecomment-321381804 has a description of why Ecto decided it was wrong for them to attempt to validate UUIDs. Not all the logic there applies completely to Ash, but some of it does.
Given how these Ash.Type.* are used, it makes sense to make separate types that add these stricter validations for those that want them, but continue to support a base-case UUID type.
This issue made upgrading a project I work on from an older version of Ash more complicated than it could have been.
Operating system
MacOS
Current Behavior
UUIDs with unrecognized versions raise exceptions:
Unknown Error
* ** (ArgumentError) cannot load `<< ... >>` as type #Ash.Type.UUID.EctoType<[]> for field :xxx in %A.B.C{xxx: nil}
Reproduction
Create a Ash.Type.UUID attribute in a resource. Use a version that doesn't match 1..5 or 7.
Expected Behavior
Accept the UUID instead of checking the version.
Interesting. I can see a case why we may not want to validate them, but there is a bit of a complexity here which is that casting input in Ash is used for a lot more things than ecto. The reason this change happened in the first case is that there was some code that was handling search strings that did something like this:
case Ash.Type.cast_input(:uuid, value) do
{:ok, value} -> ...
end
And it getting into the {:ok, value} case for 16 byte email addresses and so was assumed to be a UUID and then was handled differently as a search value.
So there is clearly some value in my mind to validating that a given binary is in fact a UUID, and we actually use this kind of information when building filters for example, i.e if id == ^id, we will attempt to type cast id to the corresponding type and a different error will be built otherwise.
My leaning would be to allow providing a constraint to these types that says something like loose_enforcement?: true (need a better name) that indicates for UUID/UUIDv7 that it will allow any given 16 byte binary.
Hmm... one can still construct a email address that the current code will consider a UUID:
iex(69)> a = <<"123456A", 0xc2, 0xa7, "@abc.us">>
"123456A§@abc.us"
iex(70)> Ash.Type.cast_input(:uuid, a)
{:ok, "31323334-3536-41c2-a740-6162632e7573"}
iex(71)>
Requires a single UTF-8 character in the address, and a small domain, but it's possible.
That makes me wonder if this kind of pattern (passing around uuids as an bare bitstring and then trying to guess if it's a uuid) is a good idea. In other language ecosystems, I'd probably create a wrapper type around the uuid representation so it can't be confused with strings.
🤔 that is an interesting point. So in reality it sounds like our search case we should not be using the binary casting at all, but should be doing some kind of regex match on the text representation instead of matching on the binary.
And yeah you're right, it feels like it doesn't make much sense for this:
iex(69)> a = <<"123456A", 0xc2, 0xa7, "@abc.us">>
"123456A§@abc.us"
iex(70)> Ash.Type.cast_input(:uuid, a)
{:ok, "31323334-3536-41c2-a740-6162632e7573"}
to work. It's obvious now that you point it out 😅. Does that work w/ Ecto's uuid cast type?? Surprised its the first time I'm considering this 🤔
So probably the answer is to say that you cannot rely on this casting feature to detect if something is a uuid, and should instead be looking at the text representation to do that. Only problem is that this then becomes a breaking change in the other direction than the original change we made, going from strict -> loose, and maybe we should be queuing this up for 4.0 instead of pulling the rug again.