Add ability to limit worker execution time

Open 7-zete-7 opened this issue 1 year ago • 26 comments

Currently, in worker mode, the execution time limit is entirely PHP's responsibility via the max_execution_time directive. However, there are cases when PHP does not react to the execution time exceeding this directive.

One example of max_execution_time being ignored was described in https://github.com/dunglas/frankenphp/issues/1162: when connecting to an unavailable database, PHP hangs and stops honoring the directive. This blocks the affected workers until they finish, which eventually leads to "Connection timeout" errors.

Similar implementations already exist in:

  • Caddy HTTP Server (see https://caddyserver.com/docs/caddyfile/directives/php_fastcgi#read_timeout)
  • Nginx (see https://nginx.org/en/docs/http/ngx_http_fastcgi_module.html#fastcgi_read_timeout)
  • Apache HTTP Server (see https://httpd.apache.org/mod_fcgid/mod/mod_fcgid.html#FcgidBusyTimeout)
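For reference, Caddy's read_timeout linked above is configured as a php_fastcgi subdirective. A sketch (the site address and socket path are placeholders):

```caddyfile
example.com {
    php_fastcgi unix//run/php/php-fpm.sock {
        # Give up waiting on the FastCGI backend after 30s
        read_timeout 30s
    }
}
```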

Unlike the servers above, FrankenPHP is able to communicate with PHP directly. It would be great if the execution time limit could also be configured from PHP code, or if FrankenPHP reacted to calls to the set_time_limit() function.

7-zete-7 avatar Nov 29 '24 20:11 7-zete-7

Is PHP in your case actually hanging during a request or while booting up the worker script? If PHP becomes unresponsive even though there's a max_execution_time, then yeah we would need to destroy and re-create the thread.

AlliBalliBaba avatar Dec 08 '24 13:12 AlliBalliBaba

In my case, PHP hangs while processing the request (at the application level).

And yes, PHP does not respond to max_execution_time, stream_set_timeout and PDO::ATTR_TIMEOUT.

7-zete-7 avatar Dec 09 '24 08:12 7-zete-7

Was this solved? I just had a similar hang with a service running 1.3.3 where the max_execution_time of 30s did not kick in and the request seemingly kept hanging for a long time. It has now been upgraded to 1.3.6 in the hope that it doesn't happen again.

taisph avatar Jan 09 '25 09:01 taisph

There's a PR in the works right now that allows automatically scaling, starting and stopping threads (#1266) — kind of what FPM does with processes, but using more lightweight threads instead. That PR should improve the 'hanging' when a database is not reachable. I'm planning to add more ways to time out threads right afterwards.

AlliBalliBaba avatar Jan 09 '25 11:01 AlliBalliBaba

Is it possible this behavior occurs in non-worker mode as well? We seem to be experiencing it. We don't have worker mode enabled (unfortunately we're not aware of any explicit way to confirm this, other than that we don't have the configs set and the Symfony runtime isn't being executed). We have some SQL queries that take longer than our Caddy write timeout, and this starts to affect our API, as the thread is consumed until the query is killed. Because we use RDS Proxy, we can't set query timeouts, or the proxy will pin the connection, disabling multiplexing.

nesl247 avatar Feb 06 '25 14:02 nesl247

It's possible this is a bug in PHP itself. The behavior of max_execution_time depends on the OS/implementation (see https://externals.io/message/123331 for the gory details).

That being said, we are using zend-max-execution-timers -- at least in our Docker images -- so the limit should be based on wall-clock time and I/O shouldn't affect it. I can only think that a signal is being held up somewhere.

withinboredom avatar Feb 06 '25 23:02 withinboredom

@withinboredom Do you think it would make sense to just send SIGPROF from the Go side via a timer? That would also allow counting the stalling time towards the execution time.

AlliBalliBaba avatar Feb 06 '25 23:02 AlliBalliBaba

In my case, we're using the official FrankenPHP images on arm64 Linux in Kubernetes, in case that helps at all.

nesl247 avatar Feb 07 '25 00:02 nesl247

It looks like this may be in the MySQL lib itself because it is running the query synchronously.

Have you tuned/checked net_write_timeout and net_read_timeout in your MySQL my.cnf file to ensure it is acceptable? I'd tune those so that run-away queries will die and return control back to php.
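Those settings live in the server's my.cnf; a sketch (the values shown are MySQL's defaults, in seconds):

```ini
[mysqld]
# Abort a stalled read from / write to the client after this many seconds,
# returning control to PHP instead of blocking indefinitely.
net_read_timeout  = 30
net_write_timeout = 60
```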

withinboredom avatar Feb 10 '25 20:02 withinboredom

We're on Postgres and use RDS Proxy, which has issues with pinning, so we can't set the equivalent statement_timeout. The timeout handling therefore has to happen on the web server side, regardless of whether PHP is stuck in blocking I/O (if possible). In our case, for example, we would be fine with the thread being killed and a 504 error being returned, if that were feasible.

nesl247 avatar Feb 11 '25 03:02 nesl247

lol, for some reason I thought it was MySQL. 🤦 I'll take a closer look over there.

In our case for example we would be fine with the thread being killed and returning a 504 error or something if it was feasible.

Since the timers work based on OS signals, I suspect any signal we try to send the worker will also get blocked. Thus, we will probably have to find another way around this.

withinboredom avatar Feb 11 '25 07:02 withinboredom

Is there a way to easily reproduce this without a hanging RDS proxy? Would make it easier to test if sending an external signal to kill the thread mitigates it.

AlliBalliBaba avatar Feb 11 '25 09:02 AlliBalliBaba

Just use the following query:

SELECT pg_sleep(300);

withinboredom avatar Feb 12 '25 14:02 withinboredom

Yeah, sending SIGSEGV from the Go side does indeed fix this. I have a PR lined up for after #1266 is merged (is SIGSEGV even the correct signal to send in this case?)

Edit: I meant SIGPROF

AlliBalliBaba avatar Feb 13 '25 08:02 AlliBalliBaba

Pretty sure SIGSEGV is one we wouldn't want to use. That's a segfault and would prompt most people to report it to php-src rather than FrankenPHP if they notice the signal.

withinboredom avatar Feb 15 '25 08:02 withinboredom

@AlliBalliBaba has this been fixed? We are on FrankenPHP 1.4.1.

jsamouh avatar Mar 12 '25 14:03 jsamouh

The next version of FrankenPHP will be able to dynamically scale threads at runtime (as a first step), which will allow better efficiency during these latency spikes.

What's still on my to-do list (but I haven't found the ideal implementation yet):

  • timing out requests that have waited too long for a thread
  • killing threads from the go side (in the worst case scenario)

AlliBalliBaba avatar Mar 12 '25 16:03 AlliBalliBaba

Any chance of a new release including https://github.com/dunglas/frankenphp/pull/1266, or is it too soon?

jsamouh avatar Mar 12 '25 22:03 jsamouh

@7-zete-7 if you set max_execution_time on the MySQL side to a value lower than PHP's max_execution_time, you will receive "Query execution was interrupted, maximum statement execution time exceeded", so PHP won't hang anymore. (I tested this.)

Can it be a solution ?
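A sketch of that server-side limit (MySQL ≥ 5.7.8; the value is in milliseconds and applies to read-only SELECT statements only — the `orders` table is a hypothetical example):

```sql
-- Cap all SELECTs in this session at 5 seconds
SET SESSION max_execution_time = 5000;

-- Or cap a single query via an optimizer hint
SELECT /*+ MAX_EXECUTION_TIME(5000) */ * FROM orders;
```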

jsamouh avatar Mar 13 '25 15:03 jsamouh

Hi @jsamouh!

Thanks for your interest in this issue!

if you set max_execution_time on the MySQL side to a value lower than PHP's max_execution_time, you will receive "Query execution was interrupted, maximum statement execution time exceeded", so PHP won't hang anymore. (I tested this.)

Can it be a solution ?

For those who have this problem when executing MySQL queries, this may be really useful information.

However, this issue is not about MySQL or PostgreSQL query execution specifically, but rather about PHP's uncontrolled blocking mechanisms in general.

When working with PHP via other interfaces (CLI, CGI, FastCGI), a blocked PHP process is simply restarted and work continues, which made this behavior acceptable.

When working via FrankenPHP workers, restarting (or simulating a restart of) the PHP process is not so easy.

7-zete-7 avatar Mar 13 '25 15:03 7-zete-7

However, this issue is not about MySQL or PostgreSQL query execution specifically, but rather about PHP's uncontrolled blocking mechanisms in general.

Just to be sure: apart from MySQL and PDO timeouts, which don't work very well, what else can cause uncontrolled blocking? HTTP calls are OK. Redis services? Socket calls?

jsamouh avatar Mar 13 '25 17:03 jsamouh

@jsamouh I think we can have a new release soon, depends on how busy @dunglas is since he manages releases.

AlliBalliBaba avatar Mar 16 '25 16:03 AlliBalliBaba

Good point, @jsamouh!

It would be really useful to have a set of reproducers for this issue, both to verify that the error is not only on the PDO driver side and to quickly check that the problem has actually been fixed.

7-zete-7 avatar Mar 17 '25 06:03 7-zete-7

@jsamouh I think we can have a new release soon, depends on how busy @dunglas is since he manages releases.

Yeah, let us know @dunglas. This PR is the only blocker keeping FrankenPHP from being god-like :-D (IMHO)

jsamouh avatar Mar 19 '25 14:03 jsamouh

Any update on this?

jsamouh avatar Apr 02 '25 14:04 jsamouh

1.5.0 now has max_threads and max_wait_time

frankenphp {
    max_threads 40    # maximum number of threads to scale to at runtime
    max_wait_time 10s # maximum time to wait for an available thread
}

No hard thread-termination timeouts yet, though, apart from the regular PHP timeouts.

AlliBalliBaba avatar Apr 03 '25 23:04 AlliBalliBaba