luigi
Cannot set pdb breakpoint in `run()` method
Steps to reproduce
- Create a Python file named mytest.py:

import luigi

class MyTestTask(luigi.Task):
    def run(self):
        import pdb; pdb.set_trace()
- Run it:
luigid --background &
luigi --module mytest MyTestTask
Expected behavior
Luigi runs the task as usual and PDB debugger opens on the shell.
Actual behavior
The debugger cannot start because a BdbQuit exception is raised.
Extract from the log:
INFO: [pid 117] Worker Worker(salt=650688562, workers=1, host=5c70eb527280, username=root, pid=114) running MyTestTask()
--Return--
> /app/src/mytest.py(5)run()->None
-> import pdb; pdb.set_trace()
(Pdb)
ERROR: [pid 117] Worker Worker(salt=650688562, workers=1, host=5c70eb527280, username=root, pid=114) failed MyTestTask()
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/luigi/worker.py", line 199, in run
    new_deps = self._run_get_new_deps()
  File "/usr/local/lib/python3.6/dist-packages/luigi/worker.py", line 141, in _run_get_new_deps
    task_gen = self.task.run()
  File "/app/src/mytest.py", line 5, in run
    import pdb; pdb.set_trace()
  File "/usr/lib/python3.6/bdb.py", line 55, in trace_dispatch
    return self.dispatch_return(frame, arg)
  File "/usr/lib/python3.6/bdb.py", line 99, in dispatch_return
    if self.quitting: raise BdbQuit
bdb.BdbQuit
INFO: Informed scheduler that task MyTestTask__99914b932b has status FAILED
Remarks
Please note that we were able to debug Luigi tasks this way before we updated our Luigi version.
Related StackOverflow questions:
That seems to indicate that run() is called from a worker process that is not attached to your terminal.
Could you tell us which Luigi version you had before, when this still worked?
Not in detail, unfortunately, but it must have been installed in the fourth quarter of 2019.
Does using --local-scheduler work for you?
luigi --module mytest MyTestTask --local-scheduler
@hazbottles No, same error :(
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If closed, you may revisit when your time allows and reopen! Thank you for your contributions.
I still would like to see a fix for this bug.
I am having the same error here and would appreciate some help.
As @NewbiZ said, the error message indicates that the luigi process is not attached to a terminal. Luigi can cause this if --workers is more than one, or if force_multiprocessing is set in the config.
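To see why a separate worker process matters, here is a small sketch that does not involve Luigi at all. It relies on CPython's multiprocessing replacing sys.stdin with the null device in a child process, so pdb has nothing to read from there. The helper names (_report_stdin, child_stdin_is_tty) are made up for this illustration, and the "fork" start method limits it to POSIX systems.

```python
import multiprocessing
import sys


def _report_stdin(queue):
    # Runs in the child: report whether sys.stdin looks like a terminal.
    queue.put(bool(sys.stdin is not None and sys.stdin.isatty()))


def child_stdin_is_tty():
    """Start one child process and return its view of sys.stdin.isatty()."""
    ctx = multiprocessing.get_context("fork")  # "fork" is POSIX only
    queue = ctx.Queue()
    proc = ctx.Process(target=_report_stdin, args=(queue,))
    proc.start()
    result = queue.get(timeout=30)
    proc.join()
    return result


if __name__ == "__main__":
    parent_is_tty = bool(sys.stdin is not None and sys.stdin.isatty())
    print("parent stdin is a tty:", parent_is_tty)
    print("child stdin is a tty: ", child_stdin_is_tty())
```

Run from an interactive shell, the parent should report a terminal while the child does not, which would match the immediate BdbQuit in the log above.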
You can also get the same error message if something else severs the connection to the terminal, for example if you pipe the output from Luigi to another program, or possibly if some code in your workflows closes sys.stdout. It is unlikely to be a bug in Luigi, but you never know; it is probably caused by something in your environment.
If I had the issue, I would use the strace program to figure out how Luigi is started, and whether stdout was closed at some point. The output of strace can be challenging to interpret, however. Feel free to post a dump and we'll see if we can help. You will need to use strace -f to capture subprocesses as well.
You could also try to create a minimal, isolated, and self-contained way to trigger the problem. E.g. a Dockerfile that can reproduce the scenario in a container. The example above works fine on my machine.
Correction: I meant sys.stdin, not sys.stdout. I am assuming that this is what pdb checks to decide whether it is attached to a terminal, but I might be wrong.
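A lighter check than strace is to log what sys.stdin looks like from inside run(), just before the breakpoint. The following is only a sketch: describe_stdin is a hypothetical helper, not part of Luigi or pdb.

```python
import os
import sys


def describe_stdin(stream=None):
    """Return a small report on whether a stream looks like a terminal."""
    stream = sys.stdin if stream is None else stream
    # A missing stream (sys.stdin can be None) is treated as closed.
    closed = getattr(stream, "closed", True)
    return {
        "pid": os.getpid(),
        "name": getattr(stream, "name", None),
        "closed": closed,
        "isatty": (not closed) and stream.isatty(),
    }


if __name__ == "__main__":
    print(describe_stdin())
```

Calling print(describe_stdin()) as the first line of run() shows which process executes the task and whether its stdin is still a terminal. If "isatty" is False there, pdb has no interactive input to read from.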
Sorry for the delay, and thanks for the info. I finally found out that the following two entries in my luigi.cfg file prevented pdb from spawning a debugger for the example above:
[worker]
timeout=600
keep_alive=True
By disabling these two items, I am now finally able to debug luigi tasks as expected. I should really have found this earlier ... :-)
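For anyone who wants to check their own configuration, here is a rough sketch that scans the text of a luigi.cfg for the [worker] settings named in this thread. It uses only the standard library's configparser rather than Luigi's own config loader, and suspect_worker_settings is a hypothetical helper. Which of the two entries above actually matters was not established here, so the sketch simply reports every one that is enabled.

```python
import configparser

# [worker] options that this thread associates with pdb failing to start.
SUSPECT_WORKER_KEYS = ("timeout", "keep_alive", "force_multiprocessing")

# Values treated as "disabled" for the purpose of this check.
_DISABLED = ("", "0", "false", "no", "off")


def suspect_worker_settings(cfg_text):
    """Return the enabled suspect options from the [worker] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    if not parser.has_section("worker"):
        return {}
    found = {}
    for key in SUSPECT_WORKER_KEYS:
        if parser.has_option("worker", key):
            value = parser.get("worker", key).strip()
            if value.lower() not in _DISABLED:
                found[key] = value
    return found


if __name__ == "__main__":
    example = "[worker]\ntimeout=600\nkeep_alive=True\n"
    print(suspect_worker_settings(example))
```

In practice you would pass in the contents of your own file, for example suspect_worker_settings(open("luigi.cfg").read()).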
Regarding the issue, it would still be nice if pdb could also work with more configs and multiple workers, but I am leaving it to the maintainers to decide whether this is a valid feature request or out of scope, and close the issue in the latter case.
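Until something like that exists, a commonly shared workaround for debugging inside forked processes is a Pdb subclass that temporarily points sys.stdin back at the real standard input while the prompt is active. The sketch below follows that recipe. It is not a Luigi feature and has not been verified against Luigi here, the class name ForkedPdb is only a convention, and it assumes a POSIX system where file descriptor 0 of the worker process is still the terminal.

```python
import pdb
import sys


class ForkedPdb(pdb.Pdb):
    """Pdb variant that reads from the real stdin while it is prompting."""

    def interaction(self, *args, **kwargs):
        saved_stdin = sys.stdin
        try:
            # Assumption: file descriptor 0 is still the terminal, so
            # /dev/stdin (POSIX only) reaches it even though sys.stdin
            # was replaced in the child process.
            sys.stdin = open("/dev/stdin")
            super().interaction(*args, **kwargs)
        finally:
            # Close the temporary stream (if one was opened) and restore.
            if sys.stdin is not saved_stdin:
                sys.stdin.close()
            sys.stdin = saved_stdin
```

Inside run() you would then write ForkedPdb().set_trace() instead of pdb.set_trace(). With several workers, more than one process may compete for the same terminal, so this is most practical with a single worker.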