`UX` Hamilton Project

Open zilto opened this issue 9 months ago • 0 comments

Current Limitations

When browsing a Python project that uses Hamilton, there is no way to tell which files are "Hamilton modules".

This has several implications:

Users don't know which modules can be imported and passed to a Driver
Users might unknowingly add functions to a module, rendering it invalid for Hamilton
Project and IDE tooling for Hamilton has no standardized / centralized way to identify Hamilton modules
Users / tools can't know which combinations of modules can be passed together to a Driver

I touched on a similar topic in Issue #747 in the context of the CLI.

Benefits

I propose the notion of a Project (to map to the Hamilton UI "project"; maybe "workspace" is a better name) to let users declare which files are "Hamilton modules".

Features it could unlock:

LSP: multi-module features

code navigation. While you're editing hello.py, the LSP builds the dataflow from both hello.py and world.py and knows about the nodes in each.
visualization. View multiple modules in the VSCode extension instead of only the current file

CLI / pre-commit / CI: apply to all

validate all modules. A pre-commit hook can attempt to build all "single" and "composed" dataflows
generate all visualizations. Use the CLI to generate visualizations of all modules on command or on commit
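
To make the "validate all" idea concrete, here is a minimal sketch of the loop a CLI command or pre-commit hook might run. Everything in it is hypothetical: `validate_all`, the shape of the `dataflows` mapping, and the `build` callable are illustration names, not part of Hamilton. In a real setup, `build` would presumably import the listed files and try to construct a Driver from them; here a stand-in is used so the example runs on its own.

```python
from typing import Any, Callable


def validate_all(
    dataflows: dict[str, dict[str, Any]],
    build: Callable[[list[str], dict[str, Any]], Any],
) -> dict[str, str]:
    """Try to build every declared dataflow and collect the failures.

    `build` is a placeholder for whatever actually constructs the dataflow
    (assumption: it raises an exception when the dataflow is invalid).
    Returns a mapping of dataflow name -> error message.
    """
    failures: dict[str, str] = {}
    for name, spec in dataflows.items():
        try:
            build(spec["modules"], spec.get("config", {}))
        except Exception as exc:  # report every failure instead of stopping at the first
            failures[name] = str(exc)
    return failures


# Stand-in builder for the example: pretends that `broken.py` cannot be built.
def fake_build(modules: list[str], config: dict[str, Any]) -> None:
    if "broken.py" in modules:
        raise ValueError("cannot build: broken.py")


declared = {
    "single": {"modules": ["single.py"]},
    "composed": {"modules": ["a.py", "b.py"], "config": {"env": "dev"}},
    "bad": {"modules": ["a.py", "broken.py"]},
}
failures = validate_all(declared, fake_build)
print(failures)
```

A hook built this way could exit with a non-zero status whenever `failures` is non-empty, which is what pre-commit and CI need to block a commit.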

Hamilton UI

sync catalog without execution. The UI could better separate "historical dataflows" that were executed from "available dataflows" representing the state of the current code

API design

Hamilton is designed around 2 layers: dataflow definition and dataflow execution. This API relates to dataflow definition, which requires knowing:

required: Python modules (file paths; one or more)
optional: Driver config (dict)

Given Hamilton is Python-centric, it should adopt pyproject.toml as a standard. The TOML format is also well-supported by other languages for parsing (e.g., TypeScript in VSCode extension, future Rust dev tools). The format supports the relevant types to specify the Python modules and config.

Example TOML. The format offers flexibility in how dataflow definitions are specified:

# shortform notation
[tool.hamilton]
dataflows = [
  { name = "greetings", modules = ["world.py"] },
  { modules = ["hello.py"] },  # `name` is inferred when `len(modules) == 1`
]

# longform notation
# mutually exclusive with shortform because they both use `tool.hamilton.dataflows`

[[tool.hamilton.dataflows]]  # this adds to the list `hamilton.dataflows`
modules = ["single.py"]  # `name` is inferred when `len(modules) == 1`

[[tool.hamilton.dataflows]]
name = "composed"
modules = ["a.py", "b.py"]  # list `hamilton.dataflows[i].modules[...]`

[[tool.hamilton.dataflows]]
name = "inline_config"
modules = ["a.py"]
config = { env = "dev", owner = "me" }  # mapping `hamilton.dataflows[i].config{...}`

[[tool.hamilton.dataflows]]
name = "multiline_config"
modules = ["a.py"]
config.env = "dev"  # key-value pair `hamilton.dataflows[i].config{env: "dev"}`
config.owner = "me"
config.key1 = true
config.key2 = false
config.key3 = 12345

API extensibility

For now, only tool.hamilton.dataflows is defined, but more settings can be added under tool.hamilton later.

May 06 '24 15:05 zilto
