datproject-discussions
Differences/relationships between Gasket, Datscript/Datfile and hackfile
Hi there,
I have seen Datscript/Datfile, hackfile and Gasket, and I would be interested in contributing to a framework for describing data pipelines.
From what I understand, datscript instructions are converted to JSON. Is this JSON meant to be used with Gasket (not sure)? And what is hackfile for?
What are the differences / relationships between the three of them? Is there a common spec or something similar?
// sorry to ask so many questions, I am trying to figure out where the whole dat project is going!
Besides, this datscript pseudo-code looks really good: https://github.com/datproject/datscript/blob/master/example-bionode.ds
@clemsos Hi, thanks for the question.
Hackfile: The parseable text syntax, e.g.
foo bar baz
  sub thing
  another sub thing
    sub sub thing
another thing
Datscript: A 'flavor' of Hackfile that contains special commands for data pipelines
Gasket: A node program that runs data pipelines, but expects JSON input
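To illustrate the "text syntax in, JSON out" idea, here is a minimal Python sketch of parsing an indentation-nested text into a JSON tree. It assumes that two spaces of indentation mean one level of nesting, and the `parse_indented` helper and the `line`/`children` keys are made up for this illustration. It is not the hackfile parser, and the JSON that hackfile, Datscript or Gasket actually use may look quite different.

```python
import json


def parse_indented(text, indent=2):
    """Parse indentation-nested lines into a list of {"line", "children"} nodes.

    Hypothetical helper: assumes `indent` spaces per nesting level.
    """
    root = []
    # Stack of (depth, children_list); depth -1 stands for the virtual root.
    stack = [(-1, root)]
    for raw in text.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        depth = (len(raw) - len(raw.lstrip(" "))) // indent
        node = {"line": raw.strip(), "children": []}
        # Pop back to the nearest ancestor that is shallower than this line.
        while stack[-1][0] >= depth:
            stack.pop()
        stack[-1][1].append(node)
        stack.append((depth, node["children"]))
    return root


sample = """\
foo bar baz
  sub thing
  another sub thing
    sub sub thing
another thing
"""

tree = parse_indented(sample)
print(json.dumps(tree, indent=2))
```

A pipeline "flavor" in the spirit of Datscript would then sit on top of a tree like this and give meaning to particular lines, while a runner in the spirit of Gasket would only ever see the resulting JSON.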
Implementations:
- https://github.com/mafintosh/hackfile - Parse Hackfiles into low level JSON
- https://github.com/melaniecebula/datscript - Parses Datscript (using Hackfile parser) and outputs Gasket compatible JSON
- https://github.com/datproject/gasket - Run data pipelines from Gasket JSON
- https://github.com/mafintosh/datscript.sh - Run Datscript files using pure Bash
Status:
We are working on the Dat beta and won't be spending much time on these tools until we get the lower level Dat stuff out of the way. We are interested in feedback and contributions though!
@maxogden OK, I see. Dat pipelines are interesting, I'll keep an eye on that. Ping me when you want to revive it. Waiting for the Dat beta too! I have many projects with various datasets coming, so I will definitely test it. Thanks for the good work!