Datastore extension for Bonobo
Bonobo is a data processing toolkit for building ETL graphs in Python.
It would be really neat if there were a Datastore extension, in a similar vein to the opendatasoft extension, so that users could build pipelines like this:
# From https://github.com/python-bonobo/bonobo/blob/0.2/bonobo/examples/datasets/coffeeshops.py
from os.path import dirname, realpath, join

import bonobo
from bonobo.ext.opendatasoft import OpenDataSoftAPI

OUTPUT_FILENAME = realpath(join(dirname(__file__), 'coffeeshops.txt'))

graph = bonobo.Graph(
    OpenDataSoftAPI(dataset='liste-des-cafes-a-un-euro', netloc='opendata.paris.fr'),
    lambda row: '{nom_du_cafe}, {adresse}, {arrondissement} Paris, France'.format(**row),
    bonobo.FileWriter(path=OUTPUT_FILENAME),
)

if __name__ == '__main__':
    bonobo.run(graph)
    print('Import done, read {} for results.'.format(OUTPUT_FILENAME))
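For the sake of discussion, here is a rough sketch of what the extractor side of such an extension might look like. It is not an existing Bonobo or CKAN class: the name `DatastoreReader` and its parameters (`resource_id`, `netloc`, `scheme`, `page_size`, `fetch`) are made up for illustration. It only assumes the standard CKAN `datastore_search` action with its `resource_id`, `limit` and `offset` parameters, and that Bonobo will accept any callable that yields rows as the first node of a graph.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def _fetch_json(url):
    """Default fetcher: GET the URL and decode the JSON response body."""
    with urlopen(url) as response:
        return json.loads(response.read().decode('utf-8'))


class DatastoreReader:
    """Hypothetical extractor that pages through a CKAN DataStore resource.

    Illustration only: this class does not exist in Bonobo or CKAN.
    """

    def __init__(self, resource_id, netloc, scheme='https', page_size=100,
                 fetch=_fetch_json):
        self.resource_id = resource_id
        self.netloc = netloc
        self.scheme = scheme
        self.page_size = page_size
        # The fetcher is injectable so the paging logic can run without a network.
        self.fetch = fetch

    def __call__(self):
        offset = 0
        while True:
            query = urlencode({
                'resource_id': self.resource_id,
                'limit': self.page_size,
                'offset': offset,
            })
            url = '{}://{}/api/3/action/datastore_search?{}'.format(
                self.scheme, self.netloc, query)
            records = self.fetch(url)['result']['records']
            if not records:
                break  # An empty page means every record has been read.
            # Each record is a dict, like the rows in the example above.
            yield from records
            offset += len(records)
```

If something like this worked, the `OpenDataSoftAPI(...)` line in the example above could be swapped for `DatastoreReader(resource_id=..., netloc=...)` pointing at a CKAN instance, with the rest of the graph unchanged apart from the field names in the format string.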
See also #211
FWIW, I think this repo is closed now; most conversation has moved to https://github.com/ckan/ckan/discussions