
Pipeline tracing and Hayhooks cannot work at the same time

Open ArnaudWald opened this issue 1 month ago • 3 comments

Hello, I am trying to set up Haystack pipelines with hayhooks and Arize Phoenix tracing enabled.

After following the docs for setup and wrapping the pipelines, they run fine when I start them without tracing.

However, having both execution and traceability would be a really good feature for my use case. I followed the Phoenix instrumentation setup, and I can trace pipelines when they are run manually (without hayhooks).

When I try to do both at the same time, I get hit with ImportErrors and path issues that prevent it from working.


I'll try to make a reproducible example below:

In one terminal, I run phoenix serve.

In another, I run hayhooks run with HAYHOOKS_SHOW_TRACEBACKS=true.

And my hello-world pipeline_wrapper.py looks like this:

from hayhooks import BasePipelineWrapper
from haystack import Pipeline, component
from openinference.instrumentation.haystack import HaystackInstrumentor
from phoenix.otel import register

tracer_provider = register(
    project_name="test_project",
    auto_instrument=True,
    endpoint="http://localhost:4317",
)

HaystackInstrumentor().instrument(tracer_provider=tracer_provider)


@component
class Hello:
    @component.output_types(output=str)
    def run(self, word: str):
        return {"output": f"Hello, {word}!"}


class PipelineWrapper(BasePipelineWrapper):
    def setup(self) -> None:
        # Create components
        greeter = Hello()
        # Build pipeline
        self.pipeline = Pipeline()
        self.pipeline.add_component("greeter", greeter)

    def run_api(self, input_text: str) -> str:
        result = self.pipeline.run({"word": input_text})
        return result["greeter"]["output"]

Deploy it like this:

hayhooks pipeline deploy-files -n hello hello_world

Query it like this:

curl -X 'POST' \
  'http://localhost:1416/hello/run' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "input_text": "string"
}'

And the error:

Pipeline execution error: No module named 'hello' - Traceback (most recent call last):
  File "/home/awald/work_space/.venv/lib/python3.12/site-packages/hayhooks/server/utils/deploy_utils.py", line 186, in wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/awald/work_space/.venv/lib/python3.12/site-packages/hayhooks/server/utils/deploy_utils.py", line 242, in run_endpoint_without_files
    result = await run_in_threadpool(pipeline_wrapper.run_api, **run_req.model_dump())  # type:ignore[attr-defined]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/awald/work_space/.venv/lib/python3.12/site-packages/starlette/concurrency.py", line 42, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/awald/work_space/.venv/lib/python3.12/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/awald/work_space/.venv/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 2485, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "/home/awald/work_space/.venv/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 976, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/awald/work_space/pipelines/hello/pipeline_wrapper.py", line 31, in run_api
    result = self.pipeline.run({"word": input_text})
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/awald/work_space/.venv/lib/python3.12/site-packages/openinference/instrumentation/haystack/_wrappers.py", line 214, in __call__
    response = wrapped(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/awald/work_space/.venv/lib/python3.12/site-packages/haystack/core/pipeline/pipeline.py", line 385, in run
    component_outputs = self._run_component(
                        ^^^^^^^^^^^^^^^^^^^^
  File "/home/awald/work_space/.venv/lib/python3.12/site-packages/openinference/instrumentation/haystack/_wrappers.py", line 81, in __call__
    self._wrap_component_run_method(component_cls, component_instance.run)
  File "/home/awald/work_space/.venv/lib/python3.12/site-packages/openinference/instrumentation/haystack/__init__.py", line 105, in wrap_component_run_method
    wrap_function_wrapper(
  File "/home/awald/work_space/.venv/lib/python3.12/site-packages/wrapt/patches.py", line 114, in wrap_function_wrapper
    return wrap_object(module, name, FunctionWrapper, (wrapper,))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/awald/work_space/.venv/lib/python3.12/site-packages/wrapt/patches.py", line 60, in wrap_object
    (parent, attribute, original) = resolve_path(module, name)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/awald/work_space/.venv/lib/python3.12/site-packages/wrapt/patches.py", line 17, in resolve_path
    __import__(module)
ModuleNotFoundError: No module named 'hello'

Like I said above, when you comment out the instrumentation lines, redeploy the pipeline, and restart the server, it works fine.

It looks like it comes from the wrappers the instrumentor adds around the component run methods, but I'm not familiar enough with the codebase to investigate further.

Any idea how to fix this?

ArnaudWald avatar Nov 03 '25 10:11 ArnaudWald

Hi @ArnaudWald!

I had a look at how the Arize Phoenix Haystack instrumentor works. Basically, it seems to expect the loaded pipeline wrapper to be an actual importable module, and it decorates the exposed objects for tracing. In Hayhooks' case, the pipeline wrapper is loaded as a module, but we don't actually register it internally as a package.
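To illustrate the failure in isolation, here's a hedged standalone sketch (the names `hello` and `Hello` just mirror the example pipeline; nothing here is Hayhooks or wrapt API beyond the final `__import__` call): the component class carries a `__module__` equal to the pipeline folder name, wrapt's `resolve_path` tries to `__import__` that name, and it was never registered in `sys.modules`.

```python
# Hypothetical reproduction: the component class ends up with
# __module__ == "hello" (the pipeline folder name), but no module
# named "hello" exists in sys.modules, so the import raises.
class Hello:
    pass


Hello.__module__ = "hello"  # what happens when Hayhooks loads the wrapper

try:
    __import__(Hello.__module__)  # this is what wrapt's resolve_path does
except ModuleNotFoundError as e:
    print(f"instrumentation fails: {e}")
```

This is exactly the `ModuleNotFoundError: No module named 'hello'` from the traceback above.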

Short-term solution

Create an app.py and run Hayhooks programmatically. This way you'll have a single global instance of tracing objects:

import uvicorn
from hayhooks.settings import settings
from hayhooks import create_app
from openinference.instrumentation.haystack import HaystackInstrumentor
from phoenix.otel import register

tracer_provider = register(
    project_name="test_project",
    auto_instrument=True,
    endpoint="http://localhost:4317",
)

HaystackInstrumentor().instrument(tracer_provider=tracer_provider)

hayhooks = create_app()

if __name__ == "__main__":
    uvicorn.run("app:hayhooks", host=settings.host, port=settings.port)

Run it with python app.py in your venv.

In your pipeline wrapper, you need to programmatically create a package and expose the Hello component:

import sys
from types import ModuleType

from hayhooks import BasePipelineWrapper
from haystack import Pipeline, component


@component
class Hello:
    @component.output_types(output=str)
    def run(self, word: str):
        return {"output": f"Hello, {word}!"}


# Make the component discoverable for instrumentation monkey patching
# NOTE: this is a short-term solution!
_package_name = __name__.split(".")[0]
package = sys.modules.get(_package_name)
if package is None:
    package = ModuleType(_package_name)
    package.__path__ = []
    sys.modules[_package_name] = package
setattr(package, "Hello", Hello)


class PipelineWrapper(BasePipelineWrapper):
    def setup(self) -> None:
        # Create components
        greeter = Hello()
        # Build pipeline
        self.pipeline = Pipeline()
        self.pipeline.add_component("greeter", greeter)

    def run_api(self, input_text: str) -> str:
        result = self.pipeline.run({"word": input_text})
        return result["greeter"]["output"]

Then deploy it and it will work.
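To see in isolation why this registration unblocks wrapt, here's a hedged standalone sketch (again, `hello`/`Hello` mirror the example above and aren't a real API): once a synthetic module with the right name exists in `sys.modules` and exposes the component, wrapt's `__import__` call resolves instead of raising.

```python
import sys
from types import ModuleType


# Stand-in for the pipeline wrapper's component class.
class Hello:
    pass


# Register a synthetic package named "hello" and expose the component on it,
# mirroring the workaround in the pipeline wrapper above.
pkg = ModuleType("hello")
pkg.__path__ = []  # mark it as a package
pkg.Hello = Hello
sys.modules["hello"] = pkg

# Now the import wrapt performs during instrumentation succeeds.
mod = __import__("hello")
print(mod.Hello is Hello)  # True
```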


Long-term solution

I would like to do a bit of research to check whether other tracers also expect pipeline wrappers to be packages - if so, we'll add a fix to Hayhooks! Even then, I still recommend running Hayhooks programmatically in your case; otherwise, if you have multiple pipeline wrappers, you'll also end up with multiple (global) tracer instances.

mpangrazzi avatar Nov 07 '25 16:11 mpangrazzi

Hey, thank you so much for the fix, I was able to run it successfully 💪 This will work fine until there's a long-term solution.

ArnaudWald avatar Nov 18 '25 13:11 ArnaudWald

@ArnaudWald awesome! We will add a fix soon!

mpangrazzi avatar Nov 18 '25 14:11 mpangrazzi