Deadlocks when testing ImageJ tools

Open abretaud opened this issue 7 years ago • 6 comments

A strange problem: when testing some ImageJ tools with planemo (for example https://github.com/bgruening/galaxytools/blob/master/tools/image_processing/imagej2/imagej2_create_image.xml), the test never finishes. According to ps, the ImageJ process is in the T state (stopped) and never resumes. Strangely, running tool_script.sh by hand from the working directory works fine and gives the correct result.
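For reference, the T state means the process was stopped by a signal rather than blocked on I/O. Here is a minimal sketch of how a parent Python process can see that state on a POSIX system. The sleeping child is only a stand-in for ImageJ, and `stop_signal_of_child` is a hypothetical helper name, not something from Galaxy or planemo.

```python
import os
import signal
import subprocess
import sys


def stop_signal_of_child():
    """Stop a throwaway child and return the signal number that stopped it."""
    # Stand-in child: a Python process that just sleeps (not ImageJ).
    proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    try:
        # Put the child into the stopped ("T") state.
        os.kill(proc.pid, signal.SIGSTOP)
        # WUNTRACED makes waitpid report stopped children, not only exited ones.
        _, status = os.waitpid(proc.pid, os.WUNTRACED)
        if os.WIFSTOPPED(status):
            return os.WSTOPSIG(status)
        return None
    finally:
        proc.kill()  # SIGKILL also terminates a stopped process
        proc.wait()


if __name__ == "__main__":
    print("child was stopped by signal", stop_signal_of_child())
```

A stopped process only continues once something sends it SIGCONT, which would match the observation that the ImageJ process never resumes on its own.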

My guess is some kind of buffering problem between the multiple subprocess.Popen calls and ImageJ itself launching Jython, but I couldn't find a way to fix it.

We hit this problem with Sylvain from @bgo-bioimagerie

abretaud, May 19 '17 06:05

@abretaud yeah, the Java-Python bridge has some problems here. It all magically works if you use a real job runner instead of the local one, e.g. in a container. Ping @gregvonkuster

bgruening, May 19 '17 07:05

@abretaud This is the line of code in the local job runner that is blocking the Java call: https://github.com/galaxyproject/galaxy/blob/dev/lib/galaxy/jobs/runners/local.py#L100. Commenting out the preexec_fn=os.setpgrp parameter will unblock the local job runner for these tools.
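To illustrate what that parameter does, here is a minimal sketch (not the actual Galaxy code) that starts a child with `preexec_fn=os.setpgrp` and reports the IDs the child ends up with. The helper name `spawn_in_new_group` and the reporting one-liner are made up for this example. It assumes a POSIX system.

```python
import os
import subprocess
import sys

# The child prints its own process group ID and session ID.
REPORT = "import os; print(os.getpgrp(), os.getsid(0))"


def spawn_in_new_group():
    """Run a child with preexec_fn=os.setpgrp; return (pid, pgid, sid) of the child."""
    proc = subprocess.Popen(
        [sys.executable, "-c", REPORT],
        stdout=subprocess.PIPE,
        preexec_fn=os.setpgrp,  # runs in the child between fork() and exec()
    )
    out, _ = proc.communicate()
    pgid, sid = (int(x) for x in out.split())
    return proc.pid, pgid, sid


if __name__ == "__main__":
    pid, pgid, sid = spawn_in_new_group()
    print("child leads its own process group:", pgid == pid)
    print("child is still in our session:", sid == os.getsid(0))
```

The child becomes the leader of a new process group but stays in the parent's session, so it keeps the same controlling terminal, if there is one.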

gregvonkuster, May 19 '17 11:05

@abretaud is this working for you now? I have put together a new version of the container here: https://github.com/bgruening/docker-galaxy-imaging

bgruening, Jun 01 '17 13:06

I think I still have the problem, unless you mean you changed something for the local job runner?

I just tried replacing preexec_fn=os.setpgrp with preexec_fn=os.setsid and it seems to fix the deadlock. Judging by the docs, it should do the same job on the created process. It feels a little scary to touch this kind of code, though! But I can make a PR of course.
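A small sketch of the difference between the two calls, assuming a POSIX system. The helper names `child_ids` and `compare` are invented for this example. `os.setsid` also makes the child a process group leader, but in a brand new session with no controlling terminal. A possible explanation for the fix, not confirmed in this thread, is that a child in a background process group that touches the terminal gets stopped by SIGTTIN or SIGTTOU, which cannot happen once the terminal is detached. `start_new_session=True` is the documented `subprocess.Popen` equivalent of calling `os.setsid` in the child.

```python
import os
import subprocess
import sys

# The child prints its own PID, process group ID and session ID.
REPORT = "import os; print(os.getpid(), os.getpgrp(), os.getsid(0))"


def child_ids(**popen_kwargs):
    """Start a child with the given Popen options and return its IDs."""
    proc = subprocess.Popen(
        [sys.executable, "-c", REPORT],
        stdout=subprocess.PIPE,
        **popen_kwargs,
    )
    out, _ = proc.communicate()
    pid, pgid, sid = (int(x) for x in out.split())
    return {"pid": pid, "pgid": pgid, "sid": sid}


def compare():
    """Collect the child's IDs under each of the three launch options."""
    return {
        "setpgrp": child_ids(preexec_fn=os.setpgrp),
        "setsid": child_ids(preexec_fn=os.setsid),
        "start_new_session": child_ids(start_new_session=True),
    }


if __name__ == "__main__":
    print("parent session:", os.getsid(0))
    for name, ids in compare().items():
        print(name, ids)
```

With `setpgrp` only the process group changes, while with `setsid` or `start_new_session=True` both the session and the process group equal the child's PID.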

The new container version is cool, thanks! Did you make any progress @bgo-bioimagerie? (And will you be at GCC, by the way?)

abretaud, Jun 07 '17 18:06

@abretaud we decided not to PR this code, as no one will use the local runner in production. But it's hard to test :(

bgruening, Jun 07 '17 19:06

OK, I understand. It's perfectly reasonable to leave this code as it is.

abretaud, Jun 07 '17 19:06