dbx
dbx copied to clipboard
better-args-parser
Proposed changes
p.parse_known_args()[0] is enough instead of p.parse_known_args(sys.argv[1:])[0]
Types of changes
What types of changes does your code introduce to dbx?
Put an x in the boxes that apply
- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] Documentation Update (if none of the other choices apply)
Further comments
with some minor imports sorting and typo fix in the same files
Codecov Report
Merging #477 (61547ad) into main (2ad685f) will not change coverage. The diff coverage is
n/a.
:exclamation: Current head 61547ad differs from pull request most recent head a611a06. Consider uploading reports for the commit a611a06 to get more accurate results
@@ Coverage Diff @@
## main #477 +/- ##
=======================================
Coverage 91.28% 91.28%
=======================================
Files 55 55
Lines 2594 2594
Branches 374 374
=======================================
Hits 2368 2368
Misses 159 159
Partials 67 67
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.
@renardeinside, please review
@renardeinside what do you think about this PR ?
I think there should be a bigger PR to be honest, and it shouldn't be based only on the arguments change.
If we would like to do something like this, we'll need to cover 2 aspects:
- Easy and simple way to pass multiple arguments. For instance typer is a great framework but might be an overkill in terms of functionality.
- Convenient and typed way to parse the task configurations file (something like
omegaconf+pydantic?)
Therefore with this PR we could do two things:
- Close it and rework into more profound feature request and make some better design decisions
- Apply it to deliver the value faster (not sure how much value this change brings, but don't want to discourage your contribution).
Hope this explains my PoV. What would be your PoV on the above said @copdips ?
Hi @renardeinside, thanks for your in-depth reply.
To build a CLI application like dbx itself, I've the same thoughts as yours, typer is a perfect choice.
I've never used omegaconf before, just took a quick overview, it seems that there's OmegaConf.from_cli() which looks great.
Maybe the argument parsing is a little bit out of scope of dbx which is not a deployment stuff, but if you would like to introduce new dependencies for user's databricks tasks (not dbx itself), yes, I should than close this PR.
I've discovered various options and I think we could use the hydra application framework in our Python templates - it provides various useful interfaces, looks fancy and has good documentation. Would that be something that you agree with @copdips ?
Sorry for the late about this thread. After reading the doc of hydra, it looks like a good choice for config management.
What I think is how to let dbx to use hydra config, as the entrypoint is a python based file, so dbx deploy --deployment-file should handle python file type too, and probably also with a new argument --deployment-file-arguments.
Furthermore it would be nice if can we use the python deployment file in a more generalized way out of hydra, which means whatever it's hydra or not, as long as --deployment-file is a python file, just let dbx to run it, and the given python file should print the deployment in stdout that dbx can capture. By this way, we can fully leverage the python power to generate complex/redundant deployment definition.
I also was thinking in the same direction - what if we allow users to build a configuration in the same way as it's done in the FastAPI for instance.
Brief design outline:
# e.g. .dbx/configurator.py
from dbx import configurator
from dbx import Workflow, Task
wf1 = Workflow(...)
wf2 = Workflow(...)
wf1.add(Task(...))
wf1.add(Task(...))
wf2.add(Task(...))
configurator.add_workflow(w1, w2)
And then in the dbx deploy (or launch or whatever):
dbx deploy --deployment-file=.dbx/configurator.py
i think this change requires a bigger PR and proper design phase. Please check this issue and share your thoughts - https://github.com/databrickslabs/dbx/issues/547
This one I'll close for the moment since it won't be implemented in the way it's provided.