Mutate (and likely steps too) code isn't shown in Hamilton UI
Current behavior
Running a DAG that uses the @mutate decorator (and likely step/pipe) with the Hamilton Tracker shows the correct DAG structure in the UI, but no code is attached.
Screenshots
Steps to replicate behavior
# mutate_example.py
from hamilton.function_modifiers import mutate
import pandas as pd


def transformed_data(raw_data: pd.DataFrame) -> pd.DataFrame:
    return ...  # do your regular stuff here


# min-max scales every column of transformed_data
@mutate(transformed_data)
def _normalize_columns(df: pd.DataFrame) -> pd.DataFrame:
    for column in df.columns:
        df[column] = (df[column] - df[column].min()) / (df[column].max() - df[column].min())
    return df


# masks values at or above the threshold (they become NaN)
@mutate(transformed_data, outlier_threshold=10)
def _remove_outliers(df: pd.DataFrame, outlier_threshold: float) -> pd.DataFrame:
    return df[df < outlier_threshold]
driver:
from hamilton_sdk import adapters
from hamilton import driver

import mutate_example  # the module defined above

tracker = adapters.HamiltonTracker(
    project_id=...,  # modify this as needed
    username="...",
    dag_name="mutate_example",
    tags={"environment": "DEV", "team": "MY_TEAM", "version": "mutate"},
)
dr = (
    driver.Builder()
    .with_config({})
    .with_modules(mutate_example)
    .with_adapters(tracker)
    .build()
)
Library & System Information
Latest versions of Python, the Hamilton SDK, and the Hamilton UI.
Expected behavior
The code of the mutating functions shows up in the UI.
Additional context
Guess - we're not attaching the source code appropriately or referencing it correctly for the UI to show it.
The problem here is the originating functions... @mutate isn't attached to the function and we don't collect the references. We may want to:
1. Add to originating functions
2. Create an auxiliary_functions variable to store the attached ones
I just checked: pipe_output and pipe_input / step have the same behavior (not surprised)!
After some thought, I agree with option 2: the same function can be used in multiple "pipes" or "mutates", but it makes sense to show its code only once.
An added benefit is that we get some notion of "helper" functions: we can store more than just the specific functions that we use. This could enable us to crawl the modules for functions using static analysis, etc., to make the UI more valuable.
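As a rough illustration of what crawling a module with static analysis could mean, the sketch below uses only the standard library `ast` module to list every top-level function in a source string, with its decorators and code, without importing or executing it. `crawl_functions` and `MODULE_SOURCE` are hypothetical names for this example, and `ast.unparse` needs Python 3.9+.

```python
import ast
from typing import Dict, List

MODULE_SOURCE = '''
def transformed_data(raw_data):
    return raw_data

@mutate(transformed_data)
def _normalize(df):
    return df

def _unused_helper(x):
    return x
'''


def crawl_functions(source: str) -> List[Dict[str, object]]:
    """Collect name, decorators, and source text of each top-level function."""
    tree = ast.parse(source)  # parses only; nothing in `source` is executed
    found = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            found.append(
                {
                    "name": node.name,
                    "decorators": [ast.unparse(d) for d in node.decorator_list],
                    "code": ast.get_source_segment(source, node),
                }
            )
    return found


functions = crawl_functions(MODULE_SOURCE)
for info in functions:
    print(info["name"], info["decorators"])
```

Because this works on source text, it also picks up helpers such as `_unused_helper` that never become nodes, which is the kind of extra context the UI could surface.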