datapackage-pipelines

Allow plugins to provide custom spec parsers

Open • brew opened this issue 9 years ago • 3 comments

Currently, a plugin can optionally provide one Generator class, with one my_plugin.source-spec.yaml filetype per generator. This means each *.source-spec.yaml filetype requires its own plugin, ~~and generators in separate plugins can't share common processors~~.

I propose letting plugins provide their own custom spec parsers that extend parsers.base_parser.BaseParser. This would allow plugins to resolve source-specs and generators in their own way, potentially allowing plugins to provide more than one generator type, subsequently allowing more than one *.source-spec.yaml filetype per plugin.

For example, the datapackage_pipelines_measure plugin could have a social-media generator, a website-analytics generator, a code-packaging generator, etc. And each project directory could contain the corresponding social-media.measure.source-spec.yaml, website-analytics.measure.source-spec.yaml, and code-packaging.measure.source-spec.yaml files.

A proposed parser discovery solution:

    • [ ] specs.find_specs() looks for more parsers (subclasses of BaseParser) in the parsers directory of the plugin
    • [ ] instances of discovered plugin-supplied parsers are prepended to specs.SPEC_PARSERS (so they take precedence over native parsers).
    • [ ] specs.find_specs() carries on as normal
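
The discovery steps above could be sketched roughly as follows. This is only an illustration under assumptions: `BaseParser` and `NativeParser` here are local stand-ins (the real `parsers.base_parser.BaseParser` is not imported), `discover_parsers`, `with_plugin_parsers` and `MeasureParser` are hypothetical names, and the plugin's parsers are taken from an already-imported module rather than scanned from a directory.

```python
import inspect
import types


class BaseParser:
    """Stand-in for parsers.base_parser.BaseParser (assumption, not the real class)."""


class NativeParser(BaseParser):
    """Hypothetical native parser, standing in for an entry in specs.SPEC_PARSERS."""


def discover_parsers(module):
    """Return one instance of each BaseParser subclass found in `module`."""
    return [
        cls()
        for _, cls in inspect.getmembers(module, inspect.isclass)
        if issubclass(cls, BaseParser) and cls is not BaseParser
    ]


def with_plugin_parsers(native_parsers, plugin_modules):
    """Prepend plugin-supplied parsers so they take precedence over native ones."""
    plugin_parsers = []
    for module in plugin_modules:
        plugin_parsers.extend(discover_parsers(module))
    return plugin_parsers + list(native_parsers)


class MeasureParser(BaseParser):
    """Hypothetical parser that a plugin might ship."""


# Simulate a plugin's parsers module without touching the filesystem.
plugin_module = types.ModuleType("measure_parsers")
plugin_module.MeasureParser = MeasureParser

SPEC_PARSERS = with_plugin_parsers([NativeParser()], [plugin_module])
print([type(p).__name__ for p in SPEC_PARSERS])
```

Because the plugin parsers come first in the list, any lookup that walks the list in order would try them before the native ones.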

What do you think, @akariv?

brew, Apr 11 '17 08:04

Although I'm not opposed to this idea at all, I think that for this use case it might not be needed.

I'm thinking of a single measure.source-spec.yaml file, with sections, such as:

project-name: my project
configuration:
  social-media:
    facebook: <token>
    twitter: <token>
  analytics:
    ga: <token>
  code-packaging:
    ...

Then, the generator would generate a few pipelines, named:

<project-name>-social-media
<project-name>-web-analytics
<project-name>-code-packaging
...

Each one would have the correct processors, based on the provided configuration. I think this is a little better, as this way one file holds all settings for a single project.
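
A minimal sketch of that idea, assuming the generator can yield one (pipeline id, details) pair per section of the source spec. The function name `generate_pipelines`, the shape of the details dict, and the way the project name is turned into an id (spaces replaced by hyphens) are all assumptions for illustration; here each pipeline is simply named after its section key.

```python
def generate_pipelines(source):
    """Yield one (pipeline_id, details) pair per section under `configuration`."""
    # Assumption: the project name is made id-friendly by replacing spaces.
    project = source["project-name"].replace(" ", "-")
    for section, settings in source["configuration"].items():
        pipeline_id = f"{project}-{section}"
        # Placeholder details: the actual processors per section are not
        # specified in this thread, so only the settings are passed through.
        yield pipeline_id, {"section": section, "settings": settings}


# The source spec from the example above, already loaded into a dict.
source = {
    "project-name": "my project",
    "configuration": {
        "social-media": {"facebook": "<token>", "twitter": "<token>"},
        "analytics": {"ga": "<token>"},
    },
}

pipeline_ids = [pipeline_id for pipeline_id, _ in generate_pipelines(source)]
print(pipeline_ids)
```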

wdyt?

akariv, Apr 11 '17 14:04

Ah, okay. So I yield more than one pipeline spec from the generator? That makes sense. One downside is the Generator class might become a bit monolithic.

brew, Apr 11 '17 15:04

Yes - but we can try and make it modular in the plugin itself (e.g. use other classes to do the actual work and leave the generator as a wrapper)
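
One way this could look, as a rough sketch only: the generator stays a thin wrapper and hands each configuration section to a small helper class. The helper classes, the `SECTION_BUILDERS` registry, the `collect-...` step names and the `steps` dict are hypothetical, and the `generate_pipeline` method name is assumed rather than taken from the library.

```python
class SocialMediaSection:
    """Hypothetical helper that builds the social-media pipeline."""

    def build(self, settings):
        return {"steps": [f"collect-{name}" for name in sorted(settings)]}


class AnalyticsSection:
    """Hypothetical helper that builds the analytics pipeline."""

    def build(self, settings):
        steps = [f"collect-{name}" for name in sorted(settings)]
        return {"steps": steps + ["aggregate"]}


# Registry mapping a configuration section to the helper that handles it.
SECTION_BUILDERS = {
    "social-media": SocialMediaSection(),
    "analytics": AnalyticsSection(),
}


class Generator:
    """Thin wrapper: picks a helper per section and yields its pipeline."""

    @classmethod
    def generate_pipeline(cls, source):
        project = source["project-name"].replace(" ", "-")
        for section, settings in source["configuration"].items():
            builder = SECTION_BUILDERS.get(section)
            if builder is None:
                continue  # Sections without a helper are skipped in this sketch.
            yield f"{project}-{section}", builder.build(settings)


example_source = {
    "project-name": "my project",
    "configuration": {
        "social-media": {"twitter": "<token>", "facebook": "<token>"},
        "code-packaging": {},
    },
}
pipelines = list(Generator.generate_pipeline(example_source))
print(pipelines)
```

Adding a new kind of pipeline then means adding one helper class and one registry entry, without growing the generator itself.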

akariv, Apr 11 '17 18:04