Dat Project ecosystem -- Open Data Tools

Open • jbenet opened this issue 11 years ago • 1 comment

these are ideas, all liable to change

Dat Project -- Open Data Tools:

dat

  • see https://github.com/maxogden/dat

dat-webui - google docs meets IPython

  • frontend for dat
  • standalone webapp, talks to any dat endpoint url
  • (can ship with dat cli/server by default)
  • should be able to run dat commands in the UI?
  • should have a js repl
  • should be able to run dat commands in repl
  • should be able to run calculations on data + see output in repl
  • should be able to load modules from npm in repl
  • modules can be visualizations, calculations, transformers, etc.
  • (borrow from IPython/IJulia Notebooks)
  • hyper modular, so can be extended by people to make custom dat frontends

dathub - hosted dat instances + package metadata

  • dathub.org/jbenet/geocoder
  • (github for dat repos)
  • should be able to host any dat metadata
  • should help spin up and manage dat droplets
  • should expose a dat endpoint (even if through a redirect), e.g. dathub.org/jbenet/geocoder/dat (or /instance, /api, /rpc, or something)
  • should be able to customize hosted instance location (my own dat endpoint)
  • should be able to manage hosted instance
  • should be able to authenticate to private blob backends
  • "host + collaborate on your data here"
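
A rough sketch of how the hosted vs. custom endpoint idea could resolve, assuming the `/dat` suffix (only one of the options floated above) and an `https` scheme. `resolveEndpoint`, `setCustomEndpoint`, and the `dat.example.com` URL are hypothetical names for illustration, not part of dat or dathub:

```javascript
// Hypothetical resolver; nothing here is an existing dathub API.
// "owner/name" -> user-supplied dat endpoint URL
const customEndpoints = new Map();

function setCustomEndpoint(repo, url) {
  customEndpoints.set(repo, url);
}

function resolveEndpoint(repo) {
  const parts = repo.split('/');
  if (parts.length !== 2 || !parts[0] || !parts[1]) {
    throw new Error(`expected "owner/name", got "${repo}"`);
  }
  // Prefer the owner's own endpoint; otherwise fall back to the hosted one.
  // The "/dat" suffix and https scheme are assumptions.
  return customEndpoints.get(repo) || `https://dathub.org/${repo}/dat`;
}

const hosted = resolveEndpoint('jbenet/geocoder');
setCustomEndpoint('jbenet/geocoder', 'https://dat.example.com/geocoder');
const custom = resolveEndpoint('jbenet/geocoder');
```

The same lookup could sit behind a redirect, so dathub.org URLs stay stable even when the instance moves.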

datadex or dat-dex or dat-index - index/registry for data packages

  • (maybe merge with dathub as one thing)
  • (npm for data packages)
  • has to support arbitrary data packages (dats, blobs, git repos, whatever has a data-package.jsonld)
  • datadex.io/jbenet/geocoder
  • designed to complement arxiv
  • so maybe Ndex.org? or just dex? e.g. dex publish foo
  • "publish/download versions of your data package here"
  • one massive shared blobstore for all data

ipfs - p2p file system, backend for index/registry

  • (super general, so it has its own use cases, but here it serves dat)
  • backend for the index
  • could even be a backend for dat
  • massive dedup / p2p distribution
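
To make the dedup point concrete, here is a minimal sketch of a content-addressed blobstore: blobs are keyed by the hash of their bytes, so identical data is stored once. This is an in-memory toy using SHA-256, both of which are assumptions; `BlobStore` is a hypothetical name and not the ipfs or dat API:

```javascript
// Hypothetical in-memory content-addressed store; not the ipfs or dat API.
const crypto = require('crypto');

class BlobStore {
  constructor() {
    this.blobs = new Map(); // hex digest -> Buffer
  }

  // Store a blob under the SHA-256 of its bytes and return that key.
  // Identical bytes always map to the same key, so duplicates cost nothing.
  put(data) {
    const buf = Buffer.from(data);
    const key = crypto.createHash('sha256').update(buf).digest('hex');
    if (!this.blobs.has(key)) this.blobs.set(key, buf);
    return key;
  }

  get(key) {
    return this.blobs.get(key);
  }

  get size() {
    return this.blobs.size;
  }
}

const store = new BlobStore();
const a = store.put('lat,lon\n1,2\n');
const b = store.put('lat,lon\n1,2\n'); // same bytes, same key
const c = store.put('lat,lon\n3,4\n'); // different bytes, new key
```

Because the key depends only on the content, any peer holding the bytes can serve them, which is what would make p2p distribution possible.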

transformer

  • see https://github.com/jbenet/transformer
  • modular interface for arbitrary conversions
  • type / conversion modules on npm
  • browserify style: transformer csv json > csv2json.js
  • used with dat through hooks
  • also, cli: cat my.csv | transform csv json > my.json
  • transformer-website:
    • help users run transformations
    • show a dat-npm-transformer instance (filter the transformer modules)
    • transformer modules should run on dat-webui
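
As a sketch of what "modular interface for arbitrary conversions" might look like, here is a tiny registry where each conversion is a function registered under a source/target type pair. `register` and `transform` are hypothetical names, not the actual transformer API, and the CSV parser is deliberately naive (no quoting or escaping):

```javascript
// Hypothetical conversion registry keyed by "from>to"; not the transformer API.
const conversions = new Map();

function register(from, to, fn) {
  conversions.set(`${from}>${to}`, fn);
}

function transform(from, to, input) {
  const fn = conversions.get(`${from}>${to}`);
  if (!fn) throw new Error(`no conversion from ${from} to ${to}`);
  return fn(input);
}

// Naive CSV to JSON: header row plus comma split, no quoting or escaping.
register('csv', 'json', (text) => {
  const [header, ...rows] = text.trim().split('\n');
  const keys = header.split(',');
  const records = rows.map((row) => {
    const cells = row.split(',');
    return Object.fromEntries(keys.map((k, i) => [k, cells[i]]));
  });
  return JSON.stringify(records);
});

const json = transform('csv', 'json', 'city,zip\nOakland,94607\n');
```

In the npm-module version of this idea, each `register` call would presumably live in its own package, and the cli would just look up the pair given on the command line.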

jbenet, Jun 05 '14 02:06

(git + github + pkg mgrs) = Source Code Collaboration

It may be useful to brand Dat this way:

Dat - Data Collaboration (or Data Collaboration Tools)

  • data version control
  • data package management
  • data distribution and streaming
  • data pipelines and services

Or something.

jbenet, Jun 24 '14 12:06