Segfault in Events::poll

Open avolver opened this issue 6 years ago • 7 comments

Hello Joe!

I get a segfault when thread B ($reader) receives a message from a channel of thread A ($writer) after thread A has already been closed by parallel\Runtime::close(). The issue only reproduces if thread A sends an object.

A reproducing test case, the segfault backtrace, and a screenshot from the IDE debugger are in the attachment parallel-issue-110.tar.gz.

Information about my environment: Linux 5.3.13-arch1-1, PHP 7.4.0, parallel build from 3dc71f8.

This issue is a continuation of https://github.com/krakjoe/parallel/issues/86

avolver, Dec 09 '19 14:12

I managed to reproduce this, and I can see the reason for it ...

Can you confirm that the fault only occurs with opcache disabled?

krakjoe, Dec 10 '19 03:12

It also crashes with opcache.

Env: Linux 3.10.0-1062.7.1.el7.x86_64 #1 SMP

PHP 7.4.0 (cli) (built: Dec  9 2019 11:42:40) ( ZTS )
Copyright (c) The PHP Group
Zend Engine v3.4.0, Copyright (c) Zend Technologies

I've attached phpinfo.txt in case it helps.

dotfry, Dec 10 '19 06:12

Set opcache.enable_cli?

krakjoe, Dec 10 '19 06:12

With opcache enabled (opcache.enable_cli=1) the segfaults are gone, so they happen only when opcache is disabled.

dotfry, Dec 10 '19 09:12

Great! With opcache enabled in the CLI, the fault no longer reproduces with a slightly modified test case:

➜ php testcase.php
Good.

➜ echo $status
0

➜ php -v
PHP 7.4.0 (cli) (built: Dec  9 2019 14:26:33) ( ZTS )
Copyright (c) The PHP Group
Zend Engine v3.4.0, Copyright (c) Zend Technologies
    with Zend OPcache v7.4.0, Copyright (c), by Zend Technologies
    with Xdebug v2.8.0, Copyright (c) 2002-2019, by Derick Rethans

➜ php -i | grep "opcache.enable_cli"
opcache.enable_cli => On => On

But the fault is still there when the opcache is disabled.

My current solution is to turn on opcache.enable_cli on production servers.
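
For scripts that rely on this workaround, a startup guard can refuse to run when opcache is off in the CLI. The following is only a minimal sketch: `opcache_active_for_cli()` and `require_opcache_for_parallel()` are hypothetical helper names, not part of parallel or OPcache, and the check reads the ini settings only, so it assumes opcache did not disable itself at runtime for some other reason.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: true when the OPcache extension is loaded and both
 * opcache.enable and opcache.enable_cli are switched on.
 */
function opcache_active_for_cli(): bool
{
    if (!extension_loaded('Zend OPcache')) {
        return false;
    }

    // ini_get() returns the raw string ("1", "On", ...) or false if unknown;
    // FILTER_VALIDATE_BOOLEAN normalises all of these to a bool.
    $enabled    = filter_var(ini_get('opcache.enable'), FILTER_VALIDATE_BOOLEAN);
    $enabledCli = filter_var(ini_get('opcache.enable_cli'), FILTER_VALIDATE_BOOLEAN);

    return $enabled && $enabledCli;
}

/**
 * Hypothetical guard: throws instead of letting the script continue into
 * the configuration in which the segfault was observed.
 */
function require_opcache_for_parallel(): void
{
    if (!opcache_active_for_cli()) {
        throw new RuntimeException(
            'opcache.enable_cli is off; enable it before using parallel in this script.'
        );
    }
}
```

Calling `require_opcache_for_parallel()` before the first `parallel\Runtime` is created turns a misconfigured server into an immediate, readable error rather than a crash later on.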

avolver, Dec 10 '19 09:12

There is definitely a flaw in the copying logic: parallel assumes the class entry will still be available after the runtime is destroyed (and it is, when opcache is enabled).

While I can think of ways around this problem, I'm reluctant to actually do anything, at least until I've had more time to think.

krakjoe, Dec 10 '19 09:12

We've enabled opcache; it was a mistake that it was disabled in the CLI. Take your time :)

dotfry, Dec 10 '19 10:12

To work around this would introduce a bunch of complexity, which I retreated from when it was first brought up.

Nothing has changed, the solution is still complex - it essentially means we have to copy code out of opcache.

The simplest thing to do is just declare that parallel depends on opcache; unfortunately, there's no way to actually do that.

This isn't resolved but I don't really think a resolution exists, so I'm closing this now.

krakjoe, May 18 '24 22:05