Recursive callback doesn't run until I manually quit pyinfra with Ctrl-C
Describe the bug
I have a recursive callback: https://github.com/geerlingguy/sbc-reviews/blob/master/benchmark/tasks/ollama-benchmark.py#L55
The callback loops over a number of operations; each iteration should run the operation, report the result, and then move on to the next iteration.
Right now, with the latest devel version of pyinfra (installed from the latest git commit), when I run my pyinfra scripts, the deploy hangs on this task:
--> Starting operation: Execute task loop
Starting nested operation: tasks/ollama-benchmark.py | Download Ollama Install Script
[10.0.2.212] nested No changes
Starting nested operation: tasks/ollama-benchmark.py | Clone ollama-benchmark with git.
[10.0.2.212] nested Success
Starting nested operation: tasks/ollama-benchmark.py | Execute Ollama loop
Starting nested operation: tasks/ollama-benchmark.py | Download Ollama model: llama3.2:3b
[10.0.2.212] nested Success
Starting nested operation: tasks/ollama-benchmark.py | Benchmark Ollama model: llama3.2:3b
[hangs here...]
When I press Ctrl-C to stop the script, the operation does actually run on the managed device, but of course the results are never reported back, and the rest of the loop doesn't run, since I've bailed out of the process.
To Reproduce
See above—note that you had set up a reproducer for my infinite callback recursion with this same callback earlier: https://github.com/pyinfra-dev/pyinfra/issues/1283
The same code worked previously when I manually applied your fix to an older 3.2 install, but installing from the git source seems not to work. I haven't bisected yet to see which commit broke things.
Expected behavior
I expect the callback to loop and run all jobs.
Meta
$ pyinfra --support
If you are having issues with pyinfra or wish to make feature requests, please
check out the GitHub issues at https://github.com/Fizzadar/pyinfra/issues .
When adding an issue, be sure to include the following:
System: Darwin
Platform: macOS-15.4.1-arm64-arm-64bit-Mach-O
Release: 24.4.0
Machine: arm64
pyinfra: v3.2
black: v25.1.0
click: v8.1.8
distro: v1.9.0
gevent: v24.11.1
importlib_metadata: v8.6.1
jinja2: v3.1.6
packaging: v24.2
paramiko: v3.5.1
python-dateutil: v2.9.0.post0
pywinrm: v0.5.0
pyyaml: v6.0.2
setuptools: v76.0.0
typeguard: v4.4.2
typing-extensions: v4.13.2
wheel: v0.45.1
Executable: /opt/homebrew/bin/pyinfra
Python: 3.13.3 (CPython, Clang 16.0.0 (clang-1600.0.26.6))
Installed via pip. Debug output where it starts to hang:
...
Starting nested operation: tasks/ollama-benchmark.py | Execute Ollama loop
Starting nested operation: tasks/ollama-benchmark.py | Download Ollama model: llama3.2:3b
[10.0.2.212] nested >>> sh -c 'ollama pull llama3.2:3b'
[10.0.2.212] nested Success
Starting nested operation: tasks/ollama-benchmark.py | Benchmark Ollama model: llama3.2:3b
[10.0.2.212] nested >>> sh -c '/home/jgeerling/Downloads/ollama-benchmark/obench.sh -m llama3.2:3b -c 3 --markdown'
| {10.0.2.212}
I might be missing something, but why use a python.call? Couldn't you just execute it directly?
Weird, I don't think anything's changed around operation handling (nested or otherwise). Will have to set up a repro and bisect to find this one!
Still running into this; unfortunately I haven't had time to go deeper and bisect anything.
@wowi42 - I am using the callback so I can get output into the console as it runs through the operations. See: https://docs.pyinfra.com/en/2.x/using-operations.html#operation-output
However... if there's a more robust way to avoid this recursive hole, I'd be okay implementing it (even if it means waiting a while for output; it's just more convenient to get the output as each call completes).
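To illustrate what I mean, here's a rough sketch based on my reading of those docs (not code from my repo): at the top level of a deploy, operations are only queued, so the result object's output isn't populated yet, whereas inside a python.call callback nested operations execute immediately and their output is available right away.
from pyinfra import logger
from pyinfra.operations import python, server

# At the top level the operation is only queued; its output isn't available
# yet, so I can't just log result.stdout at this point in the deploy code.
result = server.shell(
    name="Run benchmark",
    commands="./obench.sh -m llama3.2:3b -c 3 --markdown",
)

def report_results():
    # Inside a python.call callback, nested operations execute immediately,
    # so their output can be logged as soon as each one finishes.
    nested = server.shell(
        name="Run benchmark (nested)",
        commands="./obench.sh -m llama3.2:3b -c 3 --markdown",
    )
    logger.info(f"\n{nested.stdout}\n")

python.call(
    name="Report benchmark results",
    function=report_results,
)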
Just wanted to provide another data point for reference: I'm switching from running a script in my downloaded repository through the loop to running the commands directly.
The nested callback that runs commands directly seems to work fine:
if host.data.ai_benchmark == 'llama.cpp':
    ...
    git.repo(
        name="Clone llama.cpp with git.",
        src="https://github.com/ggerganov/llama.cpp.git",
        dest="{}/llama.cpp".format(working_dir),
    )
    ...

    # CALLBACK STARTS HERE -----------------------------------------------------
    llama_bench_opts = host.data.llama_bench_opts

    def llama_cpp_loop_callback():
        for model, model_details in host.data.llama_cpp_models.items():
            files.download(
                name="Downloading model: {}".format(model),
                src=model_details['url'],
                dest="{}/llama.cpp/models/{}".format(working_dir, model),
            )

            llama_bench_result = server.shell(
                name="Run llama-bench",
                commands="cd {}/llama.cpp && ./build/bin/llama-bench -m models/{} {}".format(working_dir, model, llama_bench_opts),
            )

            logger.info(f"\n{llama_bench_result.stdout}\n")

    python.call(
        name="Execute llama.cpp loop",
        function=llama_cpp_loop_callback,
    )
But the nested callback which runs the shell script causes the hang:
...
git.repo(
    name="Clone ai-benchmarks with git.",
    src="https://github.com/geerlingguy/ai-benchmarks.git",
    dest="{}/ai-benchmarks".format(working_dir),
)
...

# CALLBACK STARTS HERE -----------------------------------------------------
def ollama_loop_callback():
    for model, model_size in ollama_models.items():
        server.shell(
            name="Download Ollama model: {}".format(model),
            commands="ollama pull {}".format(model),
        )

        ollama_benchmark_result = server.shell(
            name="Benchmark Ollama model: {}".format(model),
            commands="{}/ai-benchmarks/obench.sh -m {} -c 3 --markdown".format(working_dir, model),
        )

        logger.info(f"\n{ollama_benchmark_result.stdout}\n\n")

python.call(
    name="Execute Ollama loop",
    function=ollama_loop_callback,
)
Running the latter results in the hanging behavior I described originally. I will be pushing my changes to the raw pyinfra file here soon: https://github.com/geerlingguy/sbc-reviews/blob/master/benchmark/tasks/ai-benchmark.py
And here's the obench.sh script I'm running in that 2nd bit of code: https://github.com/geerlingguy/ai-benchmarks/blob/main/obench.sh
Apologies for not trying this out sooner; second child arrived not long after and I totally lost track of everything.
This is fascinating - particularly because it's just a server.shell call in each case, which shouldn't behave any differently. Trying it locally within a Docker container (where ollama isn't actually running) works fine; each iteration of the script immediately errors as I'd expect.
...Finally cracked it! ollama run is waiting on stdin to be closed, which pyinfra does not do, so it just hangs indefinitely. Reproduced it by running ollama serve in a container and re-running the deploy against it over and over, which allowed me to pin it down to:
# Hangs
uv tool run pyinfra @docker/b733405e25b6 server.shell "ollama run llama3.2:3b --format json --verbose 'Why is the blue sky blue?'" -y
# Works fine
docker exec -it b733405e25b6 ollama run llama3.2:3b --verbose 'Why is the blue sky blue?'
With https://github.com/pyinfra-dev/pyinfra/pull/1506, the problem is solved. My only slight concern is whether this will negatively impact any other programs. I've never encountered any CLI that behaves like this before!
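In the meantime, for anyone stuck on a release without that change, a possible workaround (untested, and assuming the hang really is just the open stdin) is to redirect stdin from /dev/null inside the command itself; the path and model below are just placeholders:
from pyinfra.operations import server

server.shell(
    name="Benchmark Ollama model: llama3.2:3b",
    # The < /dev/null redirection hands the script (and the ollama run it
    # spawns) an empty stdin that returns EOF immediately, so nothing should
    # be left waiting for stdin to be closed.
    commands="./obench.sh -m llama3.2:3b -c 3 --markdown < /dev/null",
)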
@Fizzadar if it's any consolation, Ollama is a bit strange in more ways than that :D
But definitely an odd edge case.
For now I've actually been reworking my tests using llama.cpp mostly because it's more broadly compatible with the weird hardware combinations I test :D
Oh also, congrats on your second child! Hope things have settled into something resembling sanity. Remember, you can sleep again when you're retired haha!