csv-schema
Similar Frictionless Data stuff
This is great. I'm wondering if you have seen the Frictionless Data tooling for this; I think we could join forces.
To start with there is JSON Table Schema:
http://dataprotocols.org/json-table-schema/
Then there is associated tooling e.g. this JS lib infers a JSON Table Schema from CSV:
https://github.com/okfn/json-table-schema-infer
Here's a user interface that does schema inference along with data package generation in the browser:
http://datapackagist.okfnlabs.org/ https://github.com/frictionlessdata/datapackagist
Then there's stuff that generates relevant SQL etc from JSON Table Schema e.g.
https://github.com/frictionlessdata/jsontableschema-sql-py
/cc @pwalsh
Thanks @rgrp! @harrisj also suggested supporting JSON Table Schema in #2. I'm familiar with the schema (and encouraged the team that built ckanext-dictionary to use it), but I hadn't seen Frictionless Data or all of those tools. Sounds like a neat project, and I'd love to join forces.
It looks like json-table-schema-infer is along the lines of this project's detectType and determineWinner functions in util.js. I can definitely see the value in that being a standalone library that can be plugged into a web interface (and also perhaps used as a CLI). That could then pipe to another module that converts JSON Table Schema to various types of SQL, as you've done in jsontableschema-sql-py (but in JavaScript to support the browser).
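To make the inference step concrete, here is a minimal sketch of guessing a JSON Table Schema field type per CSV column. It is not the actual code from util.js or json-table-schema-infer: `guessType`, `pickType`, and `inferFields` are hypothetical names, and the regexes and widening rule are simplifying assumptions.

```javascript
// Hypothetical helper: guess a JSON Table Schema type for one raw CSV cell.
function guessType(value) {
  const v = value == null ? '' : String(value).trim();
  if (v === '') return 'null';
  if (/^-?\d+$/.test(v)) return 'integer';
  if (/^-?\d*\.\d+$/.test(v)) return 'number';
  if (/^(true|false)$/i.test(v)) return 'boolean';
  if (/^\d{4}-\d{2}-\d{2}$/.test(v)) return 'date';
  return 'string';
}

// Hypothetical helper: collapse the set of types seen in a column into one.
// Assumption: integer + number widens to number; any other mix falls back to string.
function pickType(seen) {
  if (seen.size === 0) return 'string';
  if (seen.size === 1) return [...seen][0];
  if (seen.size === 2 && seen.has('integer') && seen.has('number')) return 'number';
  return 'string';
}

// Build the "fields" array of a JSON Table Schema from headers and rows.
function inferFields(headers, rows) {
  return headers.map((name, i) => {
    const seen = new Set();
    for (const row of rows) {
      const t = guessType(row[i]);
      if (t !== 'null') seen.add(t); // empty cells do not vote on the type
    }
    return { name, type: pickType(seen) };
  });
}

const fields = inferFields(
  ['id', 'price', 'note'],
  [['1', '2.5', 'a'], ['2', '3', ''], ['3', '4.0', 'b']]
);
// fields is:
// [ { name: 'id', type: 'integer' },
//   { name: 'price', type: 'number' },
//   { name: 'note', type: 'string' } ]
```

Widening instead of majority voting is a deliberate choice in this sketch: a single non-numeric value in a column forces `string`, which is the safer default when the schema is later used to create a database table.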
I'd have to confirm that JSON Table Schema supports things like whether a field is nullable, and custom types like ST_Geometry, but at first glance it sounds like a great idea.
@timwis nullable is supported. I am not sure about ST_Geometry, but if it isn't supported we can think about adding it. The current list of supported types is here:
http://dataprotocols.org/json-table-schema/#field-types-and-formats
Note that JTS is extensible in that you could add your own custom JSON properties, especially for special SQL type stuff.
@rgrp Yeah I was thinking the interface would allow you to select "Other" as a field type and key in your own custom value. I'll look into how that would be stored in JTS.
Regarding null, maybe I'm misreading but it seemed like with JTS you'd have to say that the "field type" is "null" rather than "it's a string that can be null" like you would in SQL?
Oh, I see - "nullable" would be the required constraint.
@timwis that's exactly right.
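As a rough illustration of the point settled above, the sketch below turns a JSON Table Schema descriptor into a `CREATE TABLE` statement, mapping `constraints.required` to `NOT NULL`. Everything here is an assumption for illustration: `toCreateTable`, `quoteIdent`, and the `SQL_TYPES` mapping are hypothetical, and `sqlType` is a made-up custom property (not part of JSON Table Schema) showing how a value such as ST_Geometry could ride along as an extension.

```javascript
// Assumed mapping from JSON Table Schema types to generic SQL types.
const SQL_TYPES = {
  string: 'TEXT',
  integer: 'INTEGER',
  number: 'NUMERIC',
  boolean: 'BOOLEAN',
  date: 'DATE',
  datetime: 'TIMESTAMP'
};

// Quote an identifier, doubling any embedded double quotes.
function quoteIdent(name) {
  return '"' + String(name).replace(/"/g, '""') + '"';
}

// Hypothetical converter: JSON Table Schema descriptor -> CREATE TABLE.
function toCreateTable(tableName, schema) {
  const columns = schema.fields.map((field) => {
    // `sqlType` is a hypothetical custom property that overrides the mapping.
    const sqlType = field.sqlType || SQL_TYPES[field.type] || 'TEXT';
    // A field is nullable unless constraints.required is true.
    const required = Boolean(field.constraints && field.constraints.required);
    return '  ' + quoteIdent(field.name) + ' ' + sqlType + (required ? ' NOT NULL' : '');
  });
  return 'CREATE TABLE ' + quoteIdent(tableName) + ' (\n' + columns.join(',\n') + '\n);';
}

const sql = toCreateTable('parcels', {
  fields: [
    { name: 'id', type: 'integer', constraints: { required: true } },
    { name: 'owner', type: 'string' },
    { name: 'shape', type: 'string', sqlType: 'ST_Geometry' }
  ]
});
console.log(sql);
// CREATE TABLE "parcels" (
//   "id" INTEGER NOT NULL,
//   "owner" TEXT,
//   "shape" ST_Geometry
// );
```

Keeping the override in a separate property leaves the `type` value valid for other JSON Table Schema tooling, which would presumably ignore a property it does not recognize.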