cljs-bean Protocol based js<->clj key value mapping


This PR introduces BeanContext, mainly for use in our malli-ts library, to enable low-overhead mapping of custom/namespaced keywords to JavaScript properties (simple benchmarks show ~10% overhead).

See also:

https://github.com/flowyourmoney/malli-ts/pull/9
Example of such a mapping in the malli-ts unit tests: [:order-item/test-dummy {::mts/clj<->js {:prop "TESTDummyXYZ"}} string?]

We have 2 reasons to make this a protocol:

Remove the overhead of allocating 3 partials on every access (prop->key, key->prop, transform)
Add (a lot) more context to transform, allowing us to lookup the malli schema for the key that needs to be transformed

To remain backwards compatible, we didn't change the signature of the transform option. Instead, when a call to ->clj is supplied with the :context option, the context is used.
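As a rough illustration of that fallback, the sketch below resolves an options map into a single context object. It is not the PR's code: LegacyCompatContext, transform-value, wrap-legacy-transform and resolve-context are hypothetical names, and the protocol is a local stand-in so the snippet runs in plain Clojure or ClojureScript without the PR branch.

```clojure
;; Hypothetical stand-in protocol; only the argument shape
;; (ctx, value, prop, key, nth) is taken from the PR description.
(defprotocol LegacyCompatContext
  (transform-value [ctx v p k n]))

(defn wrap-legacy-transform
  "Adapts a one-argument :transform fn by ignoring the extra context."
  [f]
  (reify LegacyCompatContext
    (transform-value [_ v _p _k _n] (f v))))

(defn resolve-context
  "Prefers :context when supplied; otherwise wraps :transform (or identity)."
  [{:keys [context transform]}]
  (or context (wrap-legacy-transform (or transform identity))))

;; An old-style caller keeps working with a one-argument transform:
(transform-value (resolve-context {:transform inc}) 1 "a" :a nil)
;; => 2
```

With this shape, existing one-argument transforms never see the extra arguments, so their arity does not have to change.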

The transform method, (transform [_ v p k n] ...), gets called with:

value
js property
cljs keyword
nth index (array/vector access) or nil (map access)
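To make the four arguments concrete, here is a minimal sketch of a context whose transform branches on them. BeanContextSketch is a local stand-in that only mirrors the method names quoted in this PR, tagging-ctx is a hypothetical example, and passing nil for the property and keyword on array access is an assumption rather than confirmed behavior.

```clojure
;; Local stand-in mirroring the method names quoted in this PR, defined
;; here so the sketch runs without the PR branch.
(defprotocol BeanContextSketch
  (keywords? [ctx])
  (key->prop [ctx k])
  (prop->key [ctx prop])
  (transform [ctx v p k n]))

(def tagging-ctx
  (reify BeanContextSketch
    (keywords? [_] true)
    (key->prop [_ k] (name k))
    (prop->key [_ prop] (keyword prop))
    ;; Uses the extra arguments to decide how to treat the value.
    (transform [_ v p k n]
      (cond
        (some? n)    [:element n v]  ; array/vector access: n is the index
        (= k :price) [:money v]      ; map access, decided by the keyword
        :else        v))))

(transform tagging-ctx 10 "price" :price nil) ;; => [:money 10]
(transform tagging-ctx "x" nil nil 0)         ;; => [:element 0 "x"]
(transform tagging-ctx 1 "qty" :qty nil)      ;; => 1
```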

In malli-ts we create a context implementation that wraps a "property/keyword → malli :map schema mapping" (also following :refs) and use the additional context to find the nested schema defined keyword mapping. Perhaps in the future we will add support for coercing values lazily.

The default options have a static reified instance, for example for the default keywordize behavior:

(def ^:private keywordize-ctx
  (reify BeanContext
    (keywords? [_] true)
    (key->prop [_ x] (default-key->prop x))
    (prop->key [_ prop] (keyword prop))
    (transform [ctx v _ _ _] (->val ctx v))))

Nov 19 '22 15:11 vizanto

Cool. Will take a look at this. Seems like a mixture of perf enhancements along with some new public API.

The new public API being committed is the bit that I'd like to best understand. Two immediate thoughts:

Was primitive? meant to be made public?
I suppose this would affect doc strings and other documentation.

Nov 19 '22 17:11 mfikes

primitive? is pretty useful, but we ended up not needing it anymore. We were also discussing maybe moving the gory details into an impl namespace and making everything public there.

I haven't updated the docstrings yet, but the only new option is :context.

Nov 19 '22 17:11 vizanto

@vizanto Are there new public types being added? Can K-Transform, etc. be made private?

I suppose BeanContext and bean-context are meant to be public... they seem to be at the core of this PR.

Nov 19 '22 17:11 mfikes

Made the types private now. bean-context would be useful to keep public so that one can reuse the instance.

Nov 19 '22 17:11 vizanto

I'm attempting to get my mind around the problems that this solves, or the extra ability that this affords by way of example use.

Let's say you had things set up where you had a couple of mapping functions (these are taken from the unit tests):

cljs.user=> (require '[cljs-bean.core :refer [->clj ->js]])
nil
cljs.user=> (defn prop->key [prop]
  (cond-> prop
    (some? (re-matches #"[A-Za-z_\*\+\?!\-'][\w\*\+\?!\-']*" prop)) keyword))
#'cljs.user/prop->key
cljs.user=> (defn key->prop [key]
  (cond
    (simple-keyword? key) (name key)
    (and (string? key) (string? (prop->key key))) key
    :else nil))
#'cljs.user/key->prop

And you had a value you wanted to convert from JavaScript to ClojureScript:

cljs.user=> (def js #js [#js {:a 1, "a/b" 2, "d" 3 "v" #js [#js {:c 2 "d" 4 "x/y" 7}]}])
#'cljs.user/js

With this setup, the default mapping would look like:

cljs.user=> (->clj js)
[{:a 1, :a/b 2, :d 3, :v [{:c 2, :d 4, :x/y 7}]}]

You can override it with the existing public API as follows:

cljs.user=> (->clj js :prop->key prop->key :key->prop key->prop)
[{:a 1, "a/b" 2, :d 3, :v [{:c 2, :d 4, "x/y" 7}]}]

If you wanted to define a single context value that encapsulates all of this, with the current public API you could do

cljs.user=> (def ctx {:prop->key prop->key :key->prop key->prop})
#'cljs.user/ctx

and then achieve the same result by simply passing this context map:

cljs.user=> (->clj js ctx)
[{:a 1, "a/b" 2, :d 3, :v [{:c 2, :d 4, "x/y" 7}]}]

Is it mainly that this PR, by introducing a protocol, speeds up things by 10%? (Really, a perf benefit.)

Or is having a protocol available adding some fundamental new capability over a map-based (or keyword args-based) approach? I'm trying to think of new use cases that this enables.

(Even if it doesn't enable new use cases, it would be interesting if using a protocol speeds things up... an attempt was done with the ClojureScript compiler itself in the past where its internal data structure, which is a map, was heavily "protocolized" in an experiment to see if it sped up compilation, but interestingly the results with that experiment were negative.)

Nov 19 '22 21:11 mfikes

The new capability is in the links in the first post, the added context to transform.

The speedup is not 10% but the total overhead is now 10% (in our malli-ts test case)... speedup itself is much larger. You could compare the version with context added to 3 partial functions.

We can't use a single context map as we allow the maps in vectors in maps to have different js properties map to different keywords.
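A small sketch of why one flat mapping is not enough: below, the same property name "id" maps to a different keyword depending on where it sits in the structure. The mapping tree shape, order-mapping and convert are hypothetical illustrations (this is not how malli-ts or this PR represent it), and string-keyed Clojure maps and vectors stand in for JS objects and arrays so the code runs in plain Clojure or ClojureScript.

```clojure
;; Hypothetical mapping tree: each level lists its own property names,
;; and :child points at the mapping for the nested value.
(def order-mapping
  {:props {"id"    {:key :order/id}
           "items" {:key   :order/items
                    :child {:props {"id" {:key :order-item/id}}}}}})

(defn convert
  "Walks x (string-keyed maps and vectors standing in for JS objects and
  arrays), choosing each keyword from the mapping level it is found at.
  Properties without an entry fall back to (keyword prop)."
  [mapping x]
  (cond
    (map? x)    (into {}
                      (map (fn [[p v]]
                             (let [{kw :key child :child} (get-in mapping [:props p])]
                               [(or kw (keyword p)) (convert child v)])))
                      x)
    (vector? x) (mapv #(convert mapping %) x)
    :else       x))

(convert order-mapping {"id" 1 "items" [{"id" 7 "qty" 2}]})
;; => {:order/id 1, :order/items [{:order-item/id 7, :qty 2}]}
```

A flat prop->key function only ever sees "id", so it cannot tell these two cases apart; it needs the surrounding position, which is what the extra context supplies.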

Nov 19 '22 21:11 flow-danny

Ahh, right... somehow I missed that this is more than just "protocolizing" things... it is the extra stuff going on with transform which is a big part of the PR.

Feels like two (separate?) things going on (sorry for being slow to comprehend the big picture):

Whereas the existing :transform function takes a single argument, this PR is conceptually extending that, providing much more context that allows the transform to be more refined by leveraging that additional context.
Introducing a protocol that can be used instead of keyword args.

The first appears to be mostly about new functionality and the second appears to be mostly about perf. Not suggesting this, but it makes me wonder if these could in theory be separate PRs. But, perhaps it is difficult to separate the two things because the protocol aspect is helping with the "extension of the transform" aspect.

Nov 19 '22 22:11 mfikes

The first commit in this PR is indeed passing the context as extra arguments to the existing transform option, but that would break existing library users. (transform arity change)

The added performance benefit and making it a non breaking change is why we decided to open the PR 😀

Nov 20 '22 09:11 flow-danny

If there is a clean way to supply additional contextual information to the transform function without breaking existing clients, that would be a small extension to the public API, much easier to assess, etc.

The notion of leveraging protocols for performance could be treated as a completely independent thing.

Nov 20 '22 12:11 mfikes

Do you have any projects where you could measure a change in performance with this branch? I'm curious 😄

Nov 20 '22 12:11 vizanto

@vizanto No I don't have any such projects.

Nov 20 '22 13:11 mfikes
