datproject-discussions
Dat Project ecosystem -- Open Data Tools
these are ideas, all liable to change
Dat Project -- Open Data Tools:
dat
- see https://github.com/maxogden/dat
dat-webui - google docs meets IPython
- frontend for dat
- standalone webapp, talks to any dat endpoint url
- (can ship with dat cli/server by default)
- should be able to run dat commands in UI?
- should have a js repl
- should be able to run dat commands in repl
- should be able to run calculations on data + see output in repl
- should be able to load modules from npm in repl
- modules can be visualizations, calculations, transformers, etc.
- (borrow from IPython/IJulia Notebooks)
- hyper modular, so can be extended by people to make custom dat frontends
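A minimal sketch of the repl idea above, assuming Node's built-in vm module stands in for the browser-side evaluator. createRepl and runDatCommand are hypothetical names, not part of dat or dat-webui; lines starting with "dat " are routed to a command handler, everything else is evaluated as js against a context that persists between lines.

```javascript
const vm = require('vm');

// Hypothetical helper: returns an evaluate(line) function for one repl session.
// runDatCommand is a stand-in for whatever would talk to a dat endpoint.
function createRepl(runDatCommand) {
  // One context per session, so variables survive from line to line.
  const context = vm.createContext({});
  return function evaluate(line) {
    const trimmed = line.trim();
    if (trimmed.startsWith('dat ')) {
      // Pass the arguments through untouched; no dat subcommand is validated here.
      return runDatCommand(trimmed.slice(4).split(/\s+/));
    }
    return vm.runInContext(trimmed, context);
  };
}

// Stub handler that only echoes its arguments instead of calling dat.
const evaluate = createRepl((args) => `would run: dat ${args.join(' ')}`);

evaluate('var rows = [1, 2, 3]');
console.log(evaluate('rows.reduce((a, b) => a + b, 0)')); // 6
console.log(evaluate('dat init'));                        // would run: dat init
```

Loading modules from npm would need a require-like function injected into the context; that part is left out here.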
dathub - hosted dat instances + package metadata
- dathub.org/jbenet/geocoder
- (github for dat repos)
- should be able to host any dat metadata
- should help spin up and manage dat droplets
- should expose dat endpoint (even if through a redirect): dathub.org/jbenet/geocoder/dat, or instance or api or rpc or something
- should be able to customize hosted instance location (my own dat endpoint)
- should be able to manage hosted instance
- should be able to authenticate to private blob backends
- "host + collaborate on your data here"
datadex or dat-dex or dat-index - index/registry for data packages
- (maybe merge with dathub as one thing)
- (npm for data packages)
- has to support arbitrary data packages (dats, blobs, git repos, whatever has a data-package.jsonld)
- datadex.io/jbenet/geocoder
- designed to complement arxiv
- so maybe Ndex.org? or just dex? e.g. dex publish foo
- "publish/download versions of your data package here"
- one massive shared blobstore for all data
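A rough sketch of what "publish/download versions" could mean for the index, assuming an in-memory map from owner/name to versions. The Index class, its method names, and the blob reference strings are all hypothetical; nothing here is an existing datadex API.

```javascript
// Hypothetical in-memory index: package name -> (version -> blob reference).
class Index {
  constructor() {
    this.packages = new Map();
  }

  // Record a new version; published versions are treated as immutable.
  publish(name, version, ref) {
    const versions = this.packages.get(name) || new Map();
    if (versions.has(version)) {
      throw new Error(`${name}@${version} is already published`);
    }
    versions.set(version, ref);
    this.packages.set(name, versions);
  }

  // Look up one version, or the most recently published one if none is given.
  resolve(name, version) {
    const versions = this.packages.get(name);
    if (!versions) throw new Error(`unknown package ${name}`);
    // Map keeps insertion order, so the last key is the latest publish.
    const wanted = version === undefined ? [...versions.keys()].pop() : version;
    if (!versions.has(wanted)) throw new Error(`unknown version ${name}@${wanted}`);
    return { name, version: wanted, ref: versions.get(wanted) };
  }
}

const index = new Index();
index.publish('jbenet/geocoder', '0.1.0', 'blob-a'); // 'blob-a' is a placeholder ref
index.publish('jbenet/geocoder', '0.2.0', 'blob-b');
console.log(index.resolve('jbenet/geocoder').version); // 0.2.0
```

The ref values would point into the shared blobstore; what a real reference looks like is left open.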
ipfs - p2p file system, backend for index/registry
- (super general, so it has its own use cases, but used here for dat)
- backend for the index
- could even be a backend for dat
- massive dedup / p2p distribution
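The dedup point above can be illustrated with content addressing: if a blob's key is the hash of its bytes, storing the same bytes twice costs nothing extra. This is only a sketch using Node's crypto module and an in-memory Map; BlobStore is a hypothetical name, and the choice of sha256 is an assumption, not a description of how ipfs works.

```javascript
const crypto = require('crypto');

// Hypothetical in-memory blobstore keyed by the sha256 of the content.
class BlobStore {
  constructor() {
    this.blobs = new Map();
  }

  // Store the data and return its content hash; identical data is stored once.
  put(data) {
    const buf = Buffer.from(data);
    const key = crypto.createHash('sha256').update(buf).digest('hex');
    if (!this.blobs.has(key)) this.blobs.set(key, buf);
    return key;
  }

  // Fetch by content hash; returns undefined for unknown keys.
  get(key) {
    return this.blobs.get(key);
  }
}

const store = new BlobStore();
const first = store.put('lat,lng\n37.8,-122.2\n');
const second = store.put('lat,lng\n37.8,-122.2\n');
console.log(first === second);  // true
console.log(store.blobs.size);  // 1
```

Because keys depend only on content, any peer holding the bytes can serve them, which is what makes p2p distribution a natural fit.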
transformer
- see https://github.com/jbenet/transformer
- modular interface for arbitrary conversions
- type / conversion modules on npm
- browserify style:
    transformer csv json > csv2json.js
- used with dat through hooks
- also, cli:
    cat my.csv | transform csv json > my.json
- transformer-website:
- help users run transformations
- show a dat-npm-transformer instance (filter the transformer modules)
- transformer modules should run on dat-webui
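One way to picture the "modular interface for arbitrary conversions" is a registry of conversion functions keyed by source and target type. The sketch below is an assumption for illustration only: register and transform are hypothetical names, not the API of the transformer project, and the csv parser is deliberately naive (no quoting or escaping).

```javascript
// Hypothetical registry: "from->to" -> conversion function.
const conversions = new Map();

// Add a conversion; a real system would load these from npm modules.
function register(from, to, fn) {
  conversions.set(`${from}->${to}`, fn);
}

// Run the conversion registered for this pair of types.
function transform(from, to, input) {
  const fn = conversions.get(`${from}->${to}`);
  if (!fn) throw new Error(`no conversion registered for ${from} -> ${to}`);
  return fn(input);
}

// Naive csv -> json: first line is the header, values stay strings.
register('csv', 'json', (text) => {
  const [header, ...rows] = text
    .trim()
    .split('\n')
    .map((line) => line.split(','));
  return rows.map((cells) =>
    Object.fromEntries(header.map((key, i) => [key, cells[i]]))
  );
});

const out = transform('csv', 'json', 'city,lat\nOakland,37.8\n');
console.log(JSON.stringify(out)); // [{"city":"Oakland","lat":"37.8"}]
```

Keeping each conversion a plain function of its input is what would let the same module run from the cli, through dat hooks, or inside dat-webui.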
(git + github + pkg mgrs) = Source Code Collaboration
It may be useful to brand Dat this way:
Dat - Data Collaboration (or Data Collaboration Tools)
- data version control
- data package management
- data distribution and streaming
- data pipelines and services
Or something.