datapackage-pipelines
Rename this library
Context
Data Packages and the Frictionless Data specifications are essentially part of the protocol and inner workings of this package, but the package itself does not really require knowledge of these specs. Branding it as "Data Packages" is misleading and could confuse users into thinking that knowledge of Data Packages is required to use it.
I think we should rename the package to simply `pipelines` or `pipeline`.
What are your thoughts, @akariv?
Any opinion on this, @brew @roll @danfowler @vitorbaptista @amercader @rufuspollock?
Is it recognizable enough as `pipeline(s)`? Because it's just a rather common word. Maybe `data-pipeline(s)`?
I found I needed a passing understanding of data packages to understand how to write pipelines and processors.
I agree that `pipeline(s)` is probably too generic.
- `packaging-pipeline(s)`
- out-there suggestion: `conveyor-belt` (conveyor belts can move packages from A to B)
Thinking it over: while it is true that @brew (and anyone else who wants to write their own processors) needs to understand Data Packages in order to work with this tool, the most common user should be someone who (1) knows where her data is, (2) knows what it looks like, (3) knows where she wants to put it, and (4) knows how to write YAML. Thinking back to the first sentence of the README:
datapackage-pipelines is a framework for declarative stream-processing of tabular data. It is built upon the concepts and tooling of the Frictionless Data project.
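To illustrate what that declarative, YAML-driven usage looks like, here is a minimal sketch of a pipeline-spec.yaml. The processor names (add_metadata, add_resource, stream_remote_resources, dump.to_path) and the source URL are illustrative assumptions about the standard processor library, not canonical documentation:

```yaml
# pipeline-spec.yaml -- minimal illustrative sketch (processor names assumed)
my-dataset:
  pipeline:
    - run: add_metadata              # name/title of the output data package
      parameters:
        name: my-dataset
    - run: add_resource              # declare where the source table lives
      parameters:
        name: source-table
        url: http://example.com/source.csv
    - run: stream_remote_resources   # stream the rows of the declared resource
    - run: dump.to_path              # write the resulting data package to disk
      parameters:
        out-path: output
```

The point being: the user only declares sources, transforms, and sinks in YAML, while the Data Package machinery stays under the hood.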
Maybe the name should reference the most important type of data it ingests and spews (tables): `table-pipelines`, `tabular-pipelines`, `tabular-data-pipelines`, `table-factory`, `table-streams`, `tabulator-streams` (probably worth emphasizing in the README the connection to tabulator)...
I've thought it over more and more, and I like the factory concept (pipelines suggest just moving water around, not processing it). Conveyor belts may be a bit passive, but they have the factory sense. Maybe more of an assembly line...
But `pipeline` is an established concept in programming.
@roll good point, i.e. Unix pipes. This is a bit more heavy-duty than classic pipes, but I think you're right.
@akariv @rufuspollock @jobarratt
I've come to think this is a crucial step to take, sooner rather than later.
Candidates:
- Data Workflows: CLI `dw`
- Data Pipelines: CLI `dp`
- Data Factory: CLI `df`
- Data Flows: CLI `df`
- Others?
Any of these would be better than the current name, and all address the above concern that excluding "Data" (e.g. just `pipelines`) is confusing.
I'm happy to take a decision if needed, but I prefer to have @akariv take the call on this if he desires, as the author of the framework :).
Names for plugins are super long (cc @brew). I was against `pipelines` because it's not specific enough in my mind. But are there other one-word alternatives? `datapipe(s)`?
PS. E.g. `datapackage-pipelines-sourcespec-registry`
I added another option to https://github.com/frictionlessdata/datapackage-pipelines/issues/69#issuecomment-327076707 after chatting with @rufuspollock yesterday: shortening "Data Workflows" to "Data Flows"
I also had in mind `dataflow(s)` as something short enough: `dataflow-aws`, `dataflow-goodtables`, etc.
I like Data Flows or Data Pipelines; Pipelines marginally more, because it's more reminiscent of infrastructure.
DPP isn't just moving data from one point to another, but also transforming, changing and filtering it. Not sure how that helps, but perhaps it's more of an assembly line than a pipeline? Having said that, I like Data Flows and Data Pipelines. If I have to describe the package to someone, I say it's a data pipeline framework.
edit: Ha! @rufuspollock said exactly this about assembly lines already.
@brew if you like assembly lines, we're probably a factory, and our current pipelines are some combination of machines connected by conveyors ;-)