argo-python-dsl

Functional API?

Open ecurtin2 opened this issue 5 years ago • 3 comments

Hey there!

Love the project. I use Argo every day at work, and a nice Python API would make many things a lot easier. I'm wondering if you have plans for a more functional type of API, for example something like Prefect's.

If you're open to it, I would be willing to write up a PoC and contribute it.

ecurtin2, Jan 27 '20 23:01

Hello @ecurtin2 !

Thanks for the feedback and the suggestion. I think you're probably thinking of something like Apache Airflow, am I right?

The initial purpose of this project was to make it possible to define Argo Workflows clearly and visually. However, a functional API could definitely be useful, and I can see the use cases. The problem I expect with a functional API is scalability and adaptability to new models or Argo API changes. Argo is still quite young and is changing quickly.

Nevertheless, we could take some baby steps, prototype a few basic functions, and see how the community reacts.


This repository is expected to be migrated under argoproj-labs at some point in the future (see #1 for more info). Contributors will be more than welcome :)

CC @alexec

CermakM, Jan 28 '20 10:01

I'm still playing around in my head with the implementation, but I would like to make it possible for (at least some) workflows to be written in a way that's closer to how you'd write normal Python. For example, I'd like to be able to write an ETL pipeline like this:

extract = Template(image=..., command=..., params="Input Data")
transform = Template(image=..., command=...)
load = Template(image=..., command=...)

with SparkCluster():  # OnExit in __exit__
    data = extract("s3://stuff")
    transformed = transform(data)    # data.artifactName??
    result = load(transformed, to="s3://transformed-stuff")

# not sure about this part
workflow = Workflow(result)

Here is an example of the functional API from Prefect's documentation:

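from prefect import Flow

# extract, transform, and load are @task-decorated functions
# defined earlier in Prefect's example (definitions omitted here)
with Flow('ETL') as flow:
    e = extract()
    t = transform(e)
    l = load(t)

flow.run() # prints "Here's your data: [10, 20, 30]"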

I imagine it would be possible to do something similar for an Argo workflow.
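To make the idea concrete, here is a minimal sketch of how calling templates inside a `with` block could be recorded as a DAG. Everything in it is an assumption for illustration: the `Workflow`, `Template`, and `TaskRef` classes are hypothetical and are not part of argo-python-dsl, and the image names and commands are placeholders. Only the shape of the emitted dictionary (`dag.tasks` entries with `name`, `template`, and `dependencies`, plus `container` templates) follows the Argo Workflows manifest format.

```python
# Hypothetical sketch: none of these classes exist in argo-python-dsl.
from typing import Dict, List, Optional


class TaskRef:
    """Handle returned by calling a Template; stands in for that task's output."""

    def __init__(self, name: str) -> None:
        self.name = name


class Workflow:
    """Records every Template call made while the `with` block is active."""

    _active: Optional["Workflow"] = None

    def __init__(self, name: str) -> None:
        self.name = name
        self.tasks: List[Dict] = []
        self.templates: Dict[str, "Template"] = {}

    def __enter__(self) -> "Workflow":
        Workflow._active = self
        return self

    def __exit__(self, *exc_info) -> None:
        Workflow._active = None

    def add_task(self, template: "Template", upstream) -> TaskRef:
        # Remember the template and append one DAG task per call.
        self.templates.setdefault(template.name, template)
        task_name = f"{template.name}-{len(self.tasks)}"
        task = {"name": task_name, "template": template.name}
        if upstream:
            # Passing a TaskRef as an argument becomes a DAG dependency.
            task["dependencies"] = [ref.name for ref in upstream]
        self.tasks.append(task)
        return TaskRef(task_name)

    def to_dict(self) -> Dict:
        """Render the recorded calls as an Argo-style Workflow manifest."""
        containers = [
            {"name": t.name, "container": {"image": t.image, "command": t.command}}
            for t in self.templates.values()
        ]
        main = {"name": "main", "dag": {"tasks": self.tasks}}
        return {
            "apiVersion": "argoproj.io/v1alpha1",
            "kind": "Workflow",
            "metadata": {"generateName": f"{self.name}-"},
            "spec": {"entrypoint": "main", "templates": [main] + containers},
        }


class Template:
    """A container step; calling it registers a task in the active Workflow."""

    def __init__(self, name: str, image: str, command: List[str]) -> None:
        self.name = name
        self.image = image
        self.command = command

    def __call__(self, *upstream: TaskRef) -> TaskRef:
        workflow = Workflow._active
        if workflow is None:
            raise RuntimeError("Template called outside a Workflow block")
        return workflow.add_task(self, upstream)


# Placeholder images and commands, for illustration only.
extract = Template("extract", image="alpine", command=["echo", "extract"])
transform = Template("transform", image="alpine", command=["echo", "transform"])
load = Template("load", image="alpine", command=["echo", "load"])

with Workflow("etl") as wf:
    data = extract()
    transformed = transform(data)
    load(transformed)

manifest = wf.to_dict()
```

The dependency graph here is inferred only from which handles are passed to which calls. Parameters, artifacts, and exit handlers are left out, and those are the parts where tracking Argo API changes would likely be hardest.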

I'll be travelling for a bit, so I won't be able to look at this for a few weeks, but I will try some experiments out.

ecurtin2, Feb 02 '20 16:02

@ecurtin2 if you are still interested in working on this, we can discuss it here: https://github.com/argoproj-labs/argo-python-dsl/

binarycrayon, Oct 11 '20 20:10