Databricks workflow support for `aql.dataframe` functions

Open dimberman opened this issue 2 years ago • 0 comments

Please describe the feature you'd like to see

Astro SDK users should be able to run their aql.dataframe functions in a Databricks workflow task group.

@aql.dataframe()
def foo(...):
    ...

@aql.transform()
def bar(...):
    ...

with dag:
    with DatabricksWorkflowTaskGroup() as tg:
        f = foo()
        b = bar()

Adding this functionality should mean that the Python script submitted to Databricks runs as part of a Databricks workflow, giving users access to Databricks job clusters.
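To make the job-cluster point concrete, here is a rough sketch of the kind of job definition a workflow task group might end up submitting. The field names (`job_clusters`, `job_cluster_key`, `new_cluster`, `tasks`, `spark_python_task`) follow the Databricks Jobs API 2.1. The helper `build_workflow_payload`, the job and cluster names, and the angle-bracket placeholders are all made up for illustration, and the task keys simply mirror the example above. How each decorated function would actually be packaged is not settled here.

```python
from __future__ import annotations

from typing import Any


def build_workflow_payload(
    job_name: str,
    job_cluster_key: str,
    new_cluster: dict[str, Any],
    tasks: list[dict[str, Any]],
) -> dict[str, Any]:
    """Assemble a job definition in which every task shares one job cluster.

    Hypothetical helper, not part of Astro SDK or Cosmos.
    """
    return {
        "name": job_name,
        # The job cluster is declared once at the job level ...
        "job_clusters": [
            {"job_cluster_key": job_cluster_key, "new_cluster": new_cluster}
        ],
        # ... and each task refers to it by key instead of starting its own cluster.
        "tasks": [{**task, "job_cluster_key": job_cluster_key} for task in tasks],
    }


# Placeholder values only; real cluster settings depend on the workspace.
payload = build_workflow_payload(
    job_name="astro_sdk_workflow",
    job_cluster_key="astro_sdk_cluster",
    new_cluster={
        "spark_version": "<spark-version>",
        "node_type_id": "<node-type>",
        "num_workers": 1,
    },
    tasks=[
        {
            "task_key": "foo",
            "spark_python_task": {"python_file": "dbfs:/<path>/foo.py"},
        },
        {
            "task_key": "bar",
            "depends_on": [{"task_key": "foo"}],
            "spark_python_task": {"python_file": "dbfs:/<path>/bar.py"},
        },
    ],
)
```

Both tasks point at the same `job_cluster_key`, so they would run on one job cluster created for the workflow rather than each launching its own.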

Describe the solution you'd like

The first few steps can be the same as in https://github.com/astronomer/astro-sdk/issues/1822: we generate a Python file and load it to DBFS. The only major difference is that instead of launching the task directly, we add it to the Databricks workflow through a convert_to_databricks_workflow_task function, plus the functions needed to monitor the task remotely, similar to how we handle this in Cosmos (we could potentially even create a shared base class for these functions).
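A minimal sketch of what those two pieces could look like, under stated assumptions. The name convert_to_databricks_workflow_task comes from the proposal, but its signature here is a guess. `wait_for_task`, `TERMINAL_STATES`, and the `get_state` callable are hypothetical stand-ins for whatever the shared base class would provide. The state names (`life_cycle_state`, `result_state`, `TERMINATED`, `SUCCESS`) are taken from the Databricks Jobs API, and the DBFS path is a placeholder.

```python
from __future__ import annotations

import time
from typing import Any, Callable


def convert_to_databricks_workflow_task(
    task_id: str,
    python_file: str,
    upstream_task_ids: list[str],
    job_cluster_key: str,
) -> dict[str, Any]:
    """Describe an uploaded script as a workflow task instead of launching it."""
    return {
        "task_key": task_id,
        # Airflow upstream tasks in the same group become workflow dependencies.
        "depends_on": [{"task_key": upstream} for upstream in upstream_task_ids],
        "job_cluster_key": job_cluster_key,
        "spark_python_task": {"python_file": python_file},
    }


# Life cycle states after which a run no longer changes.
TERMINAL_STATES = {"TERMINATED", "SKIPPED", "INTERNAL_ERROR"}


def wait_for_task(
    get_state: Callable[[], dict[str, str]],
    poll_interval: float = 5.0,
    max_polls: int = 120,
) -> bool:
    """Poll until the task reaches a terminal state; return True on success."""
    for _ in range(max_polls):
        state = get_state()
        if state["life_cycle_state"] in TERMINAL_STATES:
            return state.get("result_state") == "SUCCESS"
        time.sleep(poll_interval)
    raise TimeoutError("task did not reach a terminal state in time")


task = convert_to_databricks_workflow_task(
    task_id="foo",
    python_file="dbfs:/<path>/foo.py",  # placeholder for the uploaded script
    upstream_task_ids=[],
    job_cluster_key="astro_sdk_cluster",
)

# Simulated state sequence standing in for real calls to the Databricks API.
states = iter(
    [
        {"life_cycle_state": "PENDING"},
        {"life_cycle_state": "RUNNING"},
        {"life_cycle_state": "TERMINATED", "result_state": "SUCCESS"},
    ]
)
succeeded = wait_for_task(lambda: next(states), poll_interval=0)
```

Passing the state lookup in as a callable keeps the polling logic independent of any particular hook or client, which is one way a base class shared with Cosmos could be factored.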

Are there any alternatives to this feature?

The alternative is to put the Python code in a Databricks notebook and use the Cosmos DatabricksNotebookOperator.

Additional context

Add any other context about the feature request here.

Acceptance Criteria

  • [ ] All checks and tests in the CI should pass
  • [ ] Unit tests (90% code coverage or more, once available)
  • [ ] Integration tests (if the feature relates to a new database or external service)
  • [ ] Example DAG
  • [ ] Docstrings in reStructuredText for each method, class, function, and module-level attribute (including an Example DAG showing how it should be used)
  • [ ] Exception handling in case of errors
  • [ ] Logging (are we exposing useful information to the user? e.g. source and destination)
  • [ ] Improve the documentation (README, Sphinx, and any other relevant)
  • [ ] How-to guide for the feature (example)

dimberman, Mar 03 '23 02:03