
spec2 schema/select distinction

Open xificurC opened this issue 6 years ago • 9 comments

Rich talks extensively about the distinction between a schema and the data that is actually required from it in a given context. A person can have many attributes, and different functions will require different subsets of them. That's what the new select is about. Is there similar functionality in malli? I see {:optional true} in an example schema, which is exactly what Rich found bad in his first design.

xificurC, Oct 24 '19 19:10

Hi! The plan is to implement a select like spec2's. There is an issue for it: https://github.com/metosin/malli/issues/63

jeans11, Oct 24 '19 20:10

Excellent question! I have been thinking about this a lot, and I don't think there is a single correct answer. Some thoughts:

  • in Malli, it's all Schemas right now. Like @jeans11 pointed out, we should (and will) have good tools to program with schemas. One can easily run a complex transformation/select on a root schema and get a new "select" schema out. It might even be simpler to have just one concept (schema) instead of two (schema and select)? I would say select should be a function, not a concept. We have been programming with schemas in real-life projects since 2014 with Plumatic Schema & Schema-tools and have been really happy with that.

  • with Malli, it's easy to make all fields of root schemas required and thus be compliant with "Maybe Not". There could even be a top-level option to remove support for :optional keys altogether. Maybe malli should be promoted as a library for creating schema libraries ;)

  • in real life, many databases and dynamic schema definition systems already support optional keys (like JSON Schema). Not supporting optionality in malli would make converting to/from those formats hard, if not impossible. With Spec2, per my understanding, one needs to create both a Schema and a Select for each JSON Schema Object definition: the Schema to define all the possible keys and the Select to define the required keys. It's like Spec1 :opt, but the information is now copied in two places. Also, as the closing of specs seems to be moving to the call site, I'm not sure if there will be a "closed spec" concept, which JSON Schema (and most programming languages) support.

  • optional key + nillable is actually ternary: we don't know, we know it is, we know it is not. Depending on the context, this might be valuable. With Statically Typed FP langs, I lean on Option | Maybe | Either a lot.
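The three states in the last point can be sketched with an optional key whose value is wrapped in :maybe. This is only an illustration: the Reading schema and the :z key are made up for the example, and it assumes malli.core is on the classpath.

```clojure
(require '[malli.core :as m])

;; Hypothetical schema: :z is optional AND nillable.
(def Reading
  [:map
   [:z {:optional true} [:maybe int?]]])

;; "we don't know": the key is absent
(m/validate Reading {})          ; => true
;; "we know it is not": the key is present with nil
(m/validate Reading {:z nil})    ; => true
;; "we know it is": the key is present with a value
(m/validate Reading {:z 1})      ; => true
;; a present value of the wrong type is still rejected
(m/validate Reading {:z "one"})  ; => false
```

All three states validate, so telling them apart is left to the consuming code, for example with contains? and nil?.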

ikitommi, Oct 24 '19 21:10

I see you're thinking about this a lot, that's great.

So from your POV select creates a new, derived schema. I suppose that makes sense and makes things more general, everything's just a schema this way. Supporting the already existing standards is also important.

Once you finish pondering this, it would be nice to get it mentioned in the readme. People (like me) will be comparing this to spec2, and specifically to what led to the creation of spec2.

xificurC, Oct 25 '19 07:10

Revisiting this I see you already implemented select-keys, required-keys and optional-keys, cool!

I'm pondering on one thing:

(defn foo [{:keys [x y z] :or {z 1}}]
  (if (pos-int? x) (/ y z) 0))

(select-keys M [:x :y]) is not OK in this case because input like {:x 1 :y 2 :z 0} throws. To generate reasonable tests one would need (optional-keys M [:z]). However if the model is huge we really want to narrow down the generation to just these three keys with :z being optional. That would be (-> M (select-keys [:x :y :z]) (optional-keys [:z])).
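A minimal sketch of that narrowing, assuming malli.util is aliased as mu. M here is a stand-in for the "huge" model, with made-up keys and all of them required:

```clojure
(require '[malli.core :as m]
         '[malli.util :as mu])

;; Hypothetical root model: everything is required.
(def M
  [:map
   [:x int?]
   [:y int?]
   [:z int?]
   [:name string?]])

;; Narrow to the three keys foo reads, then relax :z.
(def FooInput
  (-> M
      (mu/select-keys [:x :y :z])
      (mu/optional-keys [:z])))

(m/validate FooInput {:x 1 :y 2})        ; => true, :z may be absent
(m/validate FooInput {:x 1 :y 2 :z 3})   ; => true
(m/validate FooInput {:x 1})             ; => false, :y is required
(m/validate FooInput {:x 1 :y 2 :z "3"}) ; => false, :z must be an int when present
```

The derived FooInput is an ordinary schema, so it can be passed to validation or generation like any other.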

I think for my own data I would model everything as required, which would describe the "shape" of the data, and then chop things from that with select-keys and optional-keys to get to the correct subset.

Does this sound reasonable?

I'm also wondering whether there could be an API to capture the two operations in one swoop, like select-keys taking another argument for optionals: (select-keys M [:x :y] [:z]).
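Until such an arity exists, the same effect can be had with a small user-land helper. The sketch below is not part of malli: the name select-with-optionals and the schema M are made up for illustration, and it simply composes mu/select-keys and mu/optional-keys.

```clojure
(require '[malli.core :as m]
         '[malli.util :as mu])

;; Hypothetical helper, not a malli function: keep only the required and
;; optional keys, then mark the optional ones as such.
(defn select-with-optionals
  [schema required optional]
  (-> schema
      (mu/select-keys (into (vec required) optional))
      (mu/optional-keys optional)))

;; Hypothetical root model with every key required.
(def M
  [:map
   [:x int?]
   [:y int?]
   [:z int?]
   [:name string?]])

(def FooInput (select-with-optionals M [:x :y] [:z]))

(m/validate FooInput {:x 1 :y 2})         ; => true
(m/validate FooInput {:x 1 :y 2 :z "no"}) ; => false
```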

What about nested maps or a collection of maps? Maybe one could devise a descriptive data-driven DSL to capture the requirements, like

{:x :! :y :! :z :?}                    ; :x :y and optionally :z
{:x :! :y :! :z {:? [{:a :! :b :!}]}}  ; optional :z where it is a collection of required :a and :b

Take this for what it is, a brain dump :)

xificurC, Feb 20 '20 11:02

A standard query format like EQL might be the way to go.

ikitommi, Feb 21 '20 13:02

I thought about pull syntax, which is close to EQL, before posting, but I don't see a way to describe the required/optional part

xificurC, Feb 21 '20 15:02

Good point. One thought I had in mind was to support transforming key optionality using the map-entry syntax.

(mu/assoc-in nil [:a [:b {:optional true}]] int?)
; => [:map [:a [:map [:b {:optional true} int?]]]]

This would also work with mu/select-keys and could be used with the nested select thing?

ikitommi, Feb 21 '20 16:02

A map is a good idea as it leaves space for future additions. One could want to e.g. define the probability of a key ({:p 0.9} to include this key with 90% probability) and other conditional logic.

xificurC, Feb 21 '20 19:02

For a pet project I implemented schema-select, which allows you to create a sub-schema using the spec2 select syntax.

It has the limitation that it only accepts a map-schema to select from.
Also, the traversal of nested composites is pretty simplistic: given [{:child [:att]}] it simply picks the first matching path out of [:child :att], [:child :malli.core/in :path], [:child 0 :path] and [:child 1 :path]. But it was Good Enough™️ for my use case :)

eval, Dec 18 '20 12:12