malli
malli copied to clipboard
Add support for constraint schemas: This, that, or both keys in map
I'm just getting started with Malli, and I really like it!
There is something I can't figure out.
Consider this schema:
(def Schema
[:map
[:x boolean?]
[:y {:optional true} int?]
[:z {:optional true} string?]])
(m/validate Schema {:x true}) => true
Now imagine I want to require :y
, :z
, or both such that:
(m/validate Schema {:x true}) => false
(m/validate Schema {:x true :y 1}) => true
(m/validate Schema {:x true :z "hi"}) => true
(m/validate Schema {:x true :y 1 :z "hi"}) => true
Attempt 1
(def Schema
[:map
[:x boolean?]
[:or
[:y {:optional true} int?]
[:z {:optional true} string?]]])
(m/validate Schema {:x true})
; Execution error (ExceptionInfo) at malli.impl.util/-fail! (util.cljc:17).
; :malli.core/invalid-schema {:schema :z}
Attempt 2
(def Schema
[:or
[:map
[:x boolean?
:y int?]]
[:map
[:x boolean?
:z string?]]
[:map
[:x boolean?
:y int?
:z string?]]])
(m/validate Schema {:x true}) => true
That's not good!
Attempt 3
(def Schema
[:and
[:map [:x boolean?]]
[:or
[:map [:y int?]]
[:map [:z string?]]
[:map [:y int? :z string?]]]])
(m/validate Schema {:x true}) => false
(m/validate Schema {:x true :y 1}) => true
(m/validate Schema {:x true :z "hi"}) => true
(m/validate Schema {:x true :y 1 :z "hi"}) => true
This seems to work, but there's a bit of repetition.
The errors are pretty good, but perhaps they could be better if I did this the right way:
(me/humanize (m/explain Schema {:x true}))
=>
{:y
["missing required key"
"missing required key"],
:z ["missing required key"]}
(me/humanize (m/explain Schema {:x true :y "hi"}))
=>
{:y
["should be an int"
"should be an int"]
:z ["missing required key"]}
What's the best way to do this?
Current solution:
(def Schema
[:and
[:map
[:x boolean?]
[:y {:optional true} int?]
[:z {:optional true} string?]]
[:fn {:error/message "y & z are mutually exclusive"
:error/path [:y]}
(fn [{:keys [z y]}]
(not (and z y)))]])
(-> Schema
(m/explain {:x true :y 1 :z "hi"})
(me/humanize))
; => {:y ["y & z are mutually exclusive"]}
Future: a declarative way to do this, something like https://github.com/bsless/malli-keys-relations
actually, this is duplicate of #438. As this has more info, will close the other one.
What do you think about a datalog-like syntax for describing the constraints? Something like a simplified :where
clause
It's serializable and simple to compile
I was thinking of using meander. I think both could be tested "on paper" for few sample cases to get more insight on how they would/not work. Code syntax samples into this thread?
Sure, I'll whip up a few examples
I considered two notations, one as property and one as a standalone schema:
[:map
{:where
'[[?e :x ?x]
[?e :y ?y]
[(< ?x ?y)]]}
[:x int?]
[:y int?]]
[:and
[:map
[:x int?]
[:y int?]]
[:where
'[[?e :x ?x]
[?e :y ?y]
[(< ?x ?y)]]]]
I'll with the property notation for brevity.
I'm also missing a syntax for binding the original value being queried, let's go with :in
for now
querying keys in a map:
[:map
{:in '?e
:where
'[[?e :x ?x]
[?e :y ?y]
[(< ?x ?y)]]}
[:x int?]
[:y int?]]
Binding intermediate values:
[:map
{:in '?e
:where
'[[?e :x ?x]
[?e :y ?y]
[(count ?x) ?n]
[(< ?n ?y)]]}
[:x [:vector int?]]
[:y int?]]
Nested maps:
[:map
{:in 'e?
:where
'[[?e :x ?x]
[?x :z ?z]
[?e :y ?y]
[(== ?z ?y)]]}
[:x [:map [:z int?]]]
[:y int?]]
"every" syntax:
[:map
{:in '?e
:where
'[[?e :x [?x :as ?xs]]
[?x :z ?z]
[?e :y ?y]
[(count ?xs) ?n]
[(< ?x ?n)]
[(== ?z ?y)]]}
[:x [:vector [:map [:z int?]]]]
[:y int?]]
Implicit unification:
[:map
{:in '?e
:where
'[[?e :x [?x :as ?xs]]
[?x :z ?y]
[?e :y ?y]
[(count ?xs) ?n]
[(< ?x ?n)]]}
[:x [:vector [:map [:z int?]]]]
[:y int?]]
Mutual exclusion (can't have a situation where both 2 and 3 succeed), this is also a "contains" syntax.
[:map
{:in '?e
:where
'[[?e :x ?x] ;1
(not
[?e :y] ;2
[?e :z])]} ;3
[:x int?]
[:y int?]
[:z int?]]
I tried to introduce as few innovations as possible with this syntax. What do you think?
Thanks for your thoughts @bsless! I think the separate :where
type looks better of the two. About using datalog, that would be powerful as the nested sample shows. As it's generic, it could be used for sequences too, a generic matching on the data.
But, not ready on committing to an official malli relations-mapping solution yet, so if you would like to implement that, it could start as an add-on lib. Will try to resolve the global registry so that it's easy to plug-in custom schema types from 3rd party libraries.
cheers.
Sounds good. One issue I'd appreciate directions on is how to compile it to be performant. A naive approach would be building up a map of all the bindings, but where's the fun in that? Can it be done with functions? Do I need some CPS magic?
not an datalog perf expert, there is #datalog chan in clojurians, maybe people there can share insights. I believe Asami has a fast imp (https://github.com/threatgrid/asami)
I figured I can use malli to parse the query :)
First not so successful attempt at translating the datalog spec directly to malli
Not so successful because ::clause
can't be built into a schema.
Any tips @ikitommi ?
https://gist.github.com/bsless/632b4040a2b2ad7469369f52cd610c06
Update: the parser works. Data patterns are a bit too general so I'm considering adding specialized EA and EAV patterns but they're slightly ugly. Regarding execution model: I'll start with some naive linear interpreter but eventually I'll want to add some sort of planner and compiler. Any suggestions regarding modeling are welcome
I also ran into this problem, and we discussed it on slack.
I tried something like
[:union
[:or
[:map [:data map?]]
[:map [:dependencies {:optional true} [:set [:ref "dependency"]]]]]
[:map [:type {:optional true} string?]]]
but that didn't work out. I also tried a lot of stuff with merge
but it didn't work as expected and I just didn't really understand the logic of something like this:
[:and
[:map [:x boolean?]]
[:or
[:map [:y int?]]
[:map [:z string?]]
[:map [:y int? :z string?]]]]
This seems to say: a map with property :x, and a map with property :y, a map with property :z or a map with property :y and property :z. What I would naively expect is more like the following (here I create a naive address schema that can either have a zip code and house number or a city, street name and house number):
[:map
[:house-number number?]
[:or
[:zip-code string?]
[:and
[:city string?]
[:street-name string?]]]]
The top level map suggests that everything here happens in the context of a single map. So:
- house number should always be there.
- further more: there should be either a :zip-code or a :city ánd a :street-name
This kind of syntax would honor the principle of least astonishment (dear to my heart) for me, but Tommi pointed out that te problem with this is that Malli can in this case not distinguish :or & :and from properties in the map. I guess a combination of :or and :map is used for that. But as again Tommi pointed out this could be fixed with name-spacing all Malli keywords. That seems like a good idea all round.
so, for completeness:
[::m/map
[:house-number number?]
[::m/or
[:zip-code string?]
[::m/and
[:city string?]
[:street-name string?]]]]
To me this kind of syntax feels most familiar, it is what I would expect to have to write.