
Idiomatic way to express 'sum type' (db/valueType) spec?

Open · dgb23 opened this issue 4 years ago · 8 comments

I'm trying to programmatically convert json-schemas into datahike schemas.

Please note that I don't have a particular model/schema in mind for my use-case. I just want to generate a sensible, predictable datahike schema from an arbitrary json-schema.

An issue I have is that it isn't obvious to me how to convert sum-type-like definitions like so:

{"type": ["integer", "string"]}

In this case, a scalar value can be either an "integer" or a "string", which reminds me of clojure.spec's or.
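For reference, a sum type like this maps naturally onto clojure.spec's s/or. A minimal sketch (the ::int-or-string spec name is made up for illustration):

```clojure
(require '[clojure.spec.alpha :as s])

;; A spec mirroring {"type": ["integer", "string"]}:
;; a value conforms as either an integer or a string.
(s/def ::int-or-string (s/or :integer int?
                             :string  string?))

(s/valid? ::int-or-string 42)      ;; => true
(s/valid? ::int-or-string "forty") ;; => true
(s/valid? ::int-or-string 4.2)     ;; => false
```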

Should I extend the datahike schema spec with an additional :db/valueType? If yes, what is a sensible approach to that? Can the schema spec be modified via the datahike interface?

Or is it more sensible/idiomatic to, for example, split the schema definition into multiple attributes by using :db.type/ref and defining one :db/ident per possible JSON type, each with a concrete :db/valueType?

Thank you for reading my question!

dgb23 avatar Jun 03 '20 22:06 dgb23

Just my two cents: I do see that you are trying to cope with something given to you here that you might not have any control over. But is this type not complected? In real software I would try to implement this as two different db attributes and do the union in the query, not in the type.

After some more thought it might even be a ref: :item/price might be a ref that points sometimes to :instock/out-of-stock-string and sometimes to :item/dollar-amount-integer.
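A minimal sketch of the two-attribute approach with the union done in the query. This is an assumption-laden illustration, not the project's actual schema: the attribute names (:item/price-amount, :item/price-label) and the in-memory store config are invented for the example.

```clojure
(require '[datahike.api :as d])

;; Two separate, concretely typed attributes instead of one sum-typed one.
(def schema
  [{:db/ident       :item/price-amount     ; numeric variant
    :db/valueType   :db.type/long
    :db/cardinality :db.cardinality/one}
   {:db/ident       :item/price-label      ; string variant, e.g. "out of stock"
    :db/valueType   :db.type/string
    :db/cardinality :db.cardinality/one}])

(def cfg {:store {:backend :mem :id "sum-type-demo"}})
(d/create-database cfg)
(def conn (d/connect cfg))
(d/transact conn schema)
(d/transact conn [{:item/price-amount 100}
                  {:item/price-label  "out of stock"}])

;; The union lives in the query: an or clause matches whichever
;; variant an entity happens to carry.
(d/q '[:find ?e ?price
       :where (or [?e :item/price-amount ?price]
                  [?e :item/price-label  ?price])]
     @conn)
```

The type stays simple and concrete at the schema level; only the query that wants "a price of either kind" pays for the union.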

markusalbertgraf avatar Jun 04 '20 08:06 markusalbertgraf

@markusalbertgraf thank you, this seems to be the right mode of thinking to approach this. Modelling with this type of database is a matter of nuclear fission, so to speak: splitting conglomerates down into their atoms. It is very similar to a graph db (Neo4j, for example).

Your use of the word 'complected' in this context made me realize that my mental block comes from thinking of types and conglomerates as you would in a statically typed language or an SQL database, instead of thinking of them as composites of datoms.

Thank you for this pointer. I need to tinker a bit with this approach.

If anyone is interested: I'm using this test repository as a reference to figure this translation out:

JSON-Schema-Test-Suite

dgb23 avatar Jun 04 '20 13:06 dgb23

@dgb23 Interesting work. I think there are probably use cases for sum types, but I would have to think more about the trade-offs first.

As a sidenote: we are also currently planning to map RDF to Datahike's edn, maybe through https://github.com/ontodev/edn-ld. It would also be beneficial to have a JSON interface to datahike-server to extend the reach of our REST interface. @kordano is working on that.

whilo avatar Jun 04 '20 22:06 whilo

Any update on the RDF mapping?

jbadeau avatar Feb 12 '21 18:02 jbadeau

@jbadeau We are still working on it. What specifically would you be interested in?

whilo avatar Feb 15 '21 03:02 whilo

@whilo I am looking for a lightweight RDF store to implement an OSLC server. All of the usual suspects are way too heavy.

jbadeau avatar Feb 18 '21 10:02 jbadeau

I would also like to make use of the temporal aspects for metrics/KPIs.
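For context, Datahike exposes its temporal index through functions such as d/history and d/as-of. A minimal sketch of querying a metric's history; the :kpi/value attribute and the in-memory config are invented for illustration:

```clojure
(require '[datahike.api :as d])

(def cfg {:store {:backend :mem :id "kpi-demo"}
          :keep-history? true})           ; retain the temporal index
(d/create-database cfg)
(def conn (d/connect cfg))

(d/transact conn [{:db/ident       :kpi/value
                   :db/valueType   :db.type/long
                   :db/cardinality :db.cardinality/one}])

(d/transact conn [{:db/id -1 :kpi/value 10}])
;; ... later transactions may overwrite :kpi/value on the same entity ...

;; The history view contains every assertion (and retraction) over time,
;; tagged with the transaction it occurred in.
(d/q '[:find ?v ?tx
       :where [?e :kpi/value ?v ?tx]]
     (d/history @conn))
```

d/as-of and d/since give point-in-time and delta views over the same index.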

jbadeau avatar Feb 18 '21 10:02 jbadeau

@jbadeau That sounds interesting :+1: . I am not sure precisely what you need, but edn basically already has RDF semantics. You can transact fully namespace-qualified data about entities into Datahike.
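As a sketch of what "fully namespace-qualified data" looks like in practice: qualified keywords play the role of RDF predicates, and each resulting datom is an entity/attribute/value triple. The :foaf/* attribute names and the schema-on-read config below are illustrative assumptions, not a prescribed mapping.

```clojure
(require '[datahike.api :as d])

(def cfg {:store {:backend :mem :id "rdf-demo"}
          :schema-flexibility :read})   ; schema-on-read: no schema tx needed
(d/create-database cfg)
(def conn (d/connect cfg))

;; Namespace-qualified keywords act like RDF predicates (here borrowing
;; FOAF vocabulary names purely for illustration).
(d/transact conn [{:foaf/name     "Ada Lovelace"
                   :foaf/homepage "https://example.org/ada"}])

;; Each datom is an [entity attribute value] triple, much like RDF.
(d/q '[:find ?e ?name
       :where [?e :foaf/name ?name]]
     @conn)
```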

whilo avatar Feb 27 '21 04:02 whilo