Mutate (and likely steps too) code isn't shown in Hamilton UI

Open skrawcz opened this issue 1 year ago • 4 comments

Current behavior

Using the @mutate decorator (and likely step/pipe as well) with the Hamilton Tracker shows the correct DAG structure in the UI, but no code is attached.

Screenshots

Screen Shot 2024-10-11 at 4 18 55 PM

Steps to replicate behavior

from hamilton.function_modifiers import mutate
import pandas as pd


def transformed_data(raw_data: pd.DataFrame) -> pd.DataFrame:
    return ... # do your regular stuff here

# min-max normalizes each column of transformed_data
@mutate(transformed_data)
def _normalize_columns(df: pd.DataFrame) -> pd.DataFrame:
    for column in df.columns:
        df[column] = (df[column]-df[column].min())/(df[column].max() - df[column].min())
    return df

@mutate(transformed_data, outlier_threshold=10)
def _remove_outliers(df: pd.DataFrame, outlier_threshold: float) -> pd.DataFrame:
    return df[df < outlier_threshold]

driver:

import mutate_example  # the module containing the functions above

from hamilton import driver
from hamilton_sdk import adapters

tracker = adapters.HamiltonTracker(
    project_id=...,  # modify this as needed
    username="...",
    dag_name="mutate_example",
    tags={"environment": "DEV", "team": "MY_TEAM", "version": "mutate"},
)
dr = (
    driver.Builder()
    .with_config({})
    .with_modules(mutate_example)
    .with_adapters(tracker)
    .build()
)

Library & System Information

Latest Python, SDK, and Hamilton UI versions

Expected behavior

The code of the functions applied via @mutate should show up in the UI.

Additional context

Guess - we're not attaching the source code appropriately or referencing it correctly for the UI to show it.

skrawcz, Oct 11 '24 23:10

The problem here is the originating functions... @mutate isn't attached to the function and we don't collect the references. We may want to:

  1. Add to originating functions
  2. Create an auxiliary_functions variable to store the attached ones
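
A minimal sketch of what option 2 could look like. `NodeSketch`, `safe_source`, and `code_for_ui` are hypothetical names used only for illustration; they are not Hamilton's actual classes or API, and the real node object may be shaped quite differently.

```python
import inspect
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class NodeSketch:
    """Hypothetical stand-in for a DAG node (an assumption, not Hamilton's Node)."""

    name: str
    # Functions the node was created from.
    originating_functions: List[Callable] = field(default_factory=list)
    # Functions attached afterwards, e.g. via a mutate-style decorator (option 2).
    auxiliary_functions: List[Callable] = field(default_factory=list)


def safe_source(fn: Callable) -> Optional[str]:
    """Return the function's source text, or None when it cannot be retrieved."""
    try:
        return inspect.getsource(fn)
    except (OSError, TypeError):
        return None


def code_for_ui(node: NodeSketch) -> List[dict]:
    """Collect one entry per function that contributes to the node."""
    entries = []
    groups = (
        ("originating", node.originating_functions),
        ("auxiliary", node.auxiliary_functions),
    )
    for role, fns in groups:
        for fn in fns:
            entries.append(
                {"role": role, "name": fn.__name__, "source": safe_source(fn)}
            )
    return entries


def transformed_data(raw_data):
    return raw_data


def _normalize_columns(df):
    return df


node = NodeSketch(
    name="transformed_data",
    originating_functions=[transformed_data],
    auxiliary_functions=[_normalize_columns],
)
entries = code_for_ui(node)
```

Keeping the two lists separate would let the UI distinguish the function that defines the node from the ones that were attached to it later.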

elijahbenizzy, Oct 12 '24 00:10

Screenshot 2024-10-12 at 11 12 33

I just checked: pipe_output and pipe_input (with step) have the same behavior (not surprising)!

jernejfrank, Oct 12 '24 10:10

After some thought, I agree with option 2 (the auxiliary_functions variable): the same function can be used in multiple "pipes" or "mutates", but it only makes sense to show its code once.

jernejfrank, Oct 12 '24 12:10

An added benefit of the auxiliary_functions approach is that it gives us some notion of "helper" functions: we can store more than just the specific functions that we use. This could let us crawl the modules for functions using static analysis, etc., to make the UI more valuable.
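
As a rough illustration of the static-analysis idea, the standard-library `ast` module can pull function definitions out of a module's source without importing it. `find_helper_functions` is a hypothetical helper, and treating underscore-prefixed top-level functions as "helpers" is an assumption made for this sketch.

```python
import ast
from typing import Dict

# Example module source, inlined so the sketch is self-contained.
MODULE_SOURCE = '''
def transformed_data(raw_data):
    return raw_data

def _normalize_columns(df):
    return df

def _remove_outliers(df, outlier_threshold):
    return df[df < outlier_threshold]
'''


def find_helper_functions(source: str) -> Dict[str, str]:
    """Map each underscore-prefixed top-level function name to its source text."""
    tree = ast.parse(source)
    helpers: Dict[str, str] = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef) and node.name.startswith("_"):
            # get_source_segment returns the exact text of the definition.
            helpers[node.name] = ast.get_source_segment(source, node)
    return helpers


helpers = find_helper_functions(MODULE_SOURCE)
```

Because this works on source text alone, it could also pick up helpers that are never attached to any node.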

elijahbenizzy, Oct 12 '24 17:10