RayGraphAdapter incorrect telemetry output to Hamilton UI

Open jernejfrank opened this issue 1 year ago • 3 comments

HamiltonTracker logs incorrect execution telemetry when run together with RayGraphAdapter.

Current behaviour

  1. Each node's execution is tracked as completing immediately, regardless of how long it actually runs.
  2. When a node raises an error, the run is not displayed as failed; the UI shows all nodes as executed correctly.

Steps to replicate behavior

import pandas as pd
import time


def node_5s()->float:
    start = time.time()
    time.sleep(5)
    return time.time() - start

def node_5s_error()->float:
    start = time.time()
    time.sleep(5)
    raise ValueError("Does not break telemetry if executed through ray")
    return time.time() - start

if __name__ == "__main__":
    import __main__
    from hamilton import base, driver
    from hamilton.plugins.h_ray import RayGraphAdapter
    from hamilton_sdk import adapters
    import ray

    username = 'jernejfrank'

    tracker_ray = adapters.HamiltonTracker(
        project_id=1,  # modify this as needed
        username=username,
        dag_name="ray_telemetry_bug",
        )
    
    try:
        ray.init()
        rga = RayGraphAdapter(result_builder=base.PandasDataFrameResult())
        dr_ray = ( driver.Builder()
            .with_modules(__main__)
            .with_adapters(rga,tracker_ray)
            .build()
            )
        result_ray = dr_ray.execute(final_vars=['node_5s','node_5s_error'])
        print(result_ray)
        ray.shutdown()
    except ValueError:
        print("UI displays no problem")
    finally:
        tracker = adapters.HamiltonTracker(
            project_id=1,  # modify this as needed
            username=username,
            dag_name="telemetry_okay",
            )
        dr_without_ray = ( driver.Builder()
            .with_modules(__main__)
            .with_adapters(tracker)
            .build()
            )
        
        result_without_ray = dr_without_ray.execute(final_vars=['node_5s','node_5s_error'])

Library & System Information

  • sf-hamilton:main
  • ray 2.34.0
  • Python 3.9 and 3.10
  • Linux Ubuntu 22.04
  • MacOS Ventura 13.6.7

Expected behavior

Same telemetry as without RayGraphAdapter: actual node execution times are recorded, and a failing node marks the run as failed.

Additional context

Happy to work on that, but I will need support.

jernejfrank avatar Aug 04 '24 19:08 jernejfrank

@jernejfrank thanks for the issue. Currently this is expected behavior for use with the RayGraphAdapter (the HamiltonTracker works when using Parallel / Collect + RayTaskExecutor).
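
For context, a minimal sketch of why the timings would look immediate, assuming the tracker's hooks fire around the call that submits the node to Ray rather than around the work itself (that is an assumption, not something confirmed in this thread). It uses the standard library's `ThreadPoolExecutor` as a stand-in for Ray, so none of it is Hamilton or Ray API:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def node_sleep() -> float:
    """Stand-in for a slow node (shortened from 5s to keep the demo quick)."""
    start = time.time()
    time.sleep(0.5)
    return time.time() - start


def timed_submit(pool: ThreadPoolExecutor):
    """Time only the submission, as a hook wrapped around a remote call would."""
    start = time.perf_counter()
    future = pool.submit(node_sleep)  # returns a future right away
    submit_seconds = time.perf_counter() - start
    return future, submit_seconds


with ThreadPoolExecutor(max_workers=1) as pool:
    future, submit_seconds = timed_submit(pool)
    # The real duration (and any exception) only surfaces when the
    # future is resolved, which happens after the timing above ended.
    actual_seconds = future.result()

print(f"recorded: {submit_seconds:.3f}s, actual: {actual_seconds:.3f}s")
```

The same reasoning would apply to errors: an exception raised inside the worker stays inside the future until it is resolved, so a hook that only sees the submission has nothing to report.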

So we have all the tools, it's just the wiring that needs to be set up. There are a few approaches we could take. Let @elijahbenizzy and me sketch some options out and get back to you. I assume this is a bit of a blocker for you to use the UI then?

skrawcz avatar Aug 04 '24 21:08 skrawcz

@skrawcz ah, okay - didn't know about RayTaskExecutor and haven't had the time/need to touch the parallel module.

Our pipelines are pretty split up and only one of them uses Ray so far. Not a complete blocker, since the UI will be handy on the other pipelines, but definitely looking to have that up and running with Ray as well (since we are planning to upgrade other pipelines to Ray).

Cool, let me know when you have something in mind. Like I said, happy to contribute.

jernejfrank avatar Aug 04 '24 23:08 jernejfrank

@jernejfrank yeah so we think we have a path forward:

  1. We need to add a new lifecycle API method that is something like do_remote_execute.
  2. This will then create a wrapper function that passes adapters through so they run around the function being executed, e.g. letting the HamiltonTracker execute alongside the function that is being remotely executed.
  3. We'd then need to mess with a few of the internals a little to make this work.
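
A rough sketch of the wrapper idea in step 2, under stated assumptions: `RecordingHook` and `wrap_for_remote` are hypothetical names made up for illustration and are not Hamilton's lifecycle API, and the example runs locally rather than on Ray.

```python
from typing import Any, Callable, List, Optional


class RecordingHook:
    """Hypothetical stand-in for a tracker-like adapter (not Hamilton's API)."""

    def __init__(self) -> None:
        self.events: List[tuple] = []

    def pre_node_execute(self, node_name: str) -> None:
        self.events.append(("start", node_name))

    def post_node_execute(
        self, node_name: str, success: bool, error: Optional[Exception]
    ) -> None:
        error_name = type(error).__name__ if error is not None else None
        self.events.append(("end", node_name, success, error_name))


def wrap_for_remote(
    fn: Callable[..., Any], node_name: str, hooks: List[RecordingHook]
) -> Callable[..., Any]:
    """Return a function that runs the hooks around fn, wherever fn executes."""

    def wrapped(**kwargs: Any) -> Any:
        for hook in hooks:
            hook.pre_node_execute(node_name)
        error: Optional[Exception] = None
        try:
            return fn(**kwargs)
        except Exception as exc:
            error = exc  # keep a reference so the hooks can report it
            raise
        finally:
            for hook in hooks:
                hook.post_node_execute(node_name, error is None, error)

    return wrapped


def ok() -> int:
    return 1


def boom() -> int:
    raise ValueError("fails inside the worker")


hook = RecordingHook()
result = wrap_for_remote(ok, "ok", [hook])()
try:
    wrap_for_remote(boom, "boom", [hook])()
except ValueError:
    pass  # the error still propagates; the hook has already recorded it

print(hook.events)
```

Because the hooks run inside the wrapped function, they would see the real start, end, and failure of the node even when the wrapped function is what gets shipped to a remote worker. On a real cluster the hook object would be copied to the worker, so it would have to report to something external (as a tracker talking to a server does) rather than to an in-memory list.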

So the plan is to sketch this out in a PR. Depending on the details, we'll tag you on something to contribute where it makes sense.

skrawcz avatar Aug 09 '24 16:08 skrawcz

Hey @jernejfrank -- we outlined in detail what would be involved -- it's nothing too complicated but you get a bit of a tour of Hamilton's inner workings. Feel free to reach out if you want to pair to get started.

https://github.com/DAGWorks-Inc/hamilton/pull/1097/files

elijahbenizzy avatar Aug 15 '24 21:08 elijahbenizzy

Hi @elijahbenizzy , awesome! Super excited to look under the hood.

Let me poke around a bit over the weekend and maybe we can meet Monday or Tuesday to make it more productive? I'm based in the UK and available after 2pm GMT, let me know if it fits in your schedule.

jernejfrank avatar Aug 16 '24 16:08 jernejfrank

Sounds good! Will reach out on slack to find a time.

elijahbenizzy avatar Aug 16 '24 17:08 elijahbenizzy