action-scheduler
WP CLI worker
Instead of only using cron to run WP CLI, I believe adding a worker/watcher/listener capability would be better for many use cases and for background processing.
You can take a look at Laravel's queue for reference: https://laravel.com/docs/8.x/queues#running-the-queue-worker
This is closer to the spirit of Laravel's queue worker: https://github.com/Automattic/Cron-Control. Turns WP cron into the worker/watcher model, and runs jobs through CLI.
Not exactly the same once AS is involved, but then it's just a matter of starting up AS queues through cron jobs as often as needed.
This sounds like a valuable discussion and I definitely agree we could expand the range of options available for processing queues.
> I believe adding ability like worker/watcher/listener is better for many use case and background processing.
How are you currently running the queue, and what specific problems are you facing with the current model? Getting to grips with some specifics would help us develop our understanding of the current challenges; we already know about some of these through existing feedback, and can imagine others, but we'd love to hear about your own production experience.
It would be nice to have a command that waits for jobs to hit Action Scheduler. Right now I have it configured to run only on the CLI, using the wp action-scheduler run command. However, that task stops as soon as any backlog has been caught up and processed. It would be nice to use a tool like Supervisord to watch the process and keep it running even when the backlog is empty. My team and I have faced issues where tons of processes were getting created while trying to keep the queue running even when the backlog was empty. (Still trying to resolve this issue.)
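For reference, the kind of Supervisord setup being described might look like the fragment below. The program name and paths are hypothetical; the important detail is `startsecs=0`, since `wp action-scheduler run` exits quickly on an empty backlog and Supervisord would otherwise mark those quick exits as failures:

```ini
[program:action-scheduler-runner]
; Hypothetical paths and user; adjust for your install.
command=wp action-scheduler run --path=/var/www/html
directory=/var/www/html
user=www-data
autostart=true
; Re-launch the runner every time it finishes processing the backlog.
autorestart=true
; The runner exits as soon as the backlog is empty; startsecs=0 stops
; Supervisord from treating those quick exits as FATAL.
startsecs=0
stopsignal=TERM
```

Note this still re-spawns a fresh process per run rather than keeping one long-lived worker, which sidesteps the duplicate-process pile-up described above because Supervisord only ever manages one instance.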
For some more context on the cron-control model of scaling AS:
- The cron-control plugin moves WP cron events into their own table, and exposes a CLI & REST interface for getting/running cron events.
- There is a runner, built in Go, that continually looks for "due" cron events and runs them, allowing for true parallelism if the container/host has the threads for it: https://github.com/Automattic/Cron-Control-runner
- By default, AS has one recurring cron job that processes actions, but this has limitations with a truly large number of actions, as you need concurrent queues processing to avoid falling behind. So with the runner mentioned above, we just need to schedule some extra cron events that will run Action Scheduler queues concurrently, and that is where this mini-plugin comes in: https://github.com/Automattic/vip-go-mu-plugins/blob/79427058e33551f34009ae89b79c68a82a16aee3/cron/action-scheduler-dynamic-queue.php#L9-L21. It checks whether things are falling behind, and schedules cron events that will each kick off a concurrent AS queue.
The implementation could be greatly simplified if you can tailor the scale manually rather than having to scale dynamically. As an example:

- Create a custom WP CLI command (`wp custom-as-queue`) that basically just runs `do_action( 'action_scheduler_run_queue', 'my custom queue' )`. This will process actions in a queue for however long `action_scheduler_queue_runner_time_limit` is set to (let's say 120 seconds), then kill itself off. You don't want to go too high on the time limit, else memory can start to fill up and slow down the request.
- Have a script that runs `wp custom-as-queue` every ~120 seconds.
- Make something that can trigger a concurrent number of those scripts, whatever the container/host in question can reasonably handle.
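As a rough sketch of the last two steps (`wp custom-as-queue` being the hypothetical command described above), a small POSIX shell helper can fan out N runners and wait for them all; each runner kills itself off after `action_scheduler_queue_runner_time_limit` seconds, so the whole batch finishes in roughly one time-limit window:

```shell
#!/bin/sh
# run_concurrent N CMD...: launch N copies of CMD in parallel, then wait
# for all of them to finish. Each AS queue runner exits on its own once
# its time limit is reached, so no extra process management is needed.
run_concurrent() {
  n="$1"
  shift
  i=0
  while [ "$i" -lt "$n" ]; do
    "$@" &          # background one runner
    i=$((i + 1))
  done
  wait              # block until every backgrounded runner has exited
}

# Hypothetical usage, re-launched every ~120 seconds by cron or a wrapper:
#   run_concurrent 4 wp custom-as-queue
```

This keeps the scale fixed and explicit: pick a concurrency your container/host can handle and re-invoke the helper on a schedule.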
I'd also like to see this as an option for advanced developers.
Ideally, it would be nice to have a constant we can define to disable all the wp_remote_post calls at the end of the shutdown action: define('AS_DISABLE_SHUTDOWN_TRIGGERS', true); or something like that.
Also, a new command that doesn't exit when no tasks are left; instead, it would just sleep and wait for new tasks to come in: wp action-scheduler listen or something like that.
With that, one can set up Supervisor or pm2 to keep the process running on the server; even if the process times out or exits, a new one will be re-spawned.
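Pending a built-in listen command (which is hypothetical at this point), the same effect can be approximated with a small shell wrapper around the existing run command, which Supervisor/pm2 then keeps alive:

```shell
#!/bin/sh
# run_forever CMD PAUSE [MAX]: re-run CMD in a loop, sleeping PAUSE seconds
# between runs. MAX bounds the iterations (handy for testing); 0, the
# default, loops forever, relying on Supervisor/pm2 to restart the wrapper
# itself if it ever dies.
run_forever() {
  cmd="$1"
  pause="$2"
  max="${3:-0}"
  i=0
  while [ "$max" -eq 0 ] || [ "$i" -lt "$max" ]; do
    $cmd || true      # the runner exits once the backlog is empty
    sleep "$pause"    # idle briefly before polling for new actions
    i=$((i + 1))
  done
}

# Hypothetical production usage (the process Supervisor/pm2 would manage):
#   run_forever "wp action-scheduler run" 10
```

Because each iteration is a fresh WP CLI process, PHP memory is released between runs, unlike a single long-lived listener.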
This would also prevent errors on certain "managed WordPress hosting" providers, which shall remain nameless, that throttle or limit wp-ajax calls to "improve" performance.
Thanks for all the notes and ideas.
We did do some light experimentation at the end of last year, and looked at a model involving a single queue server running alongside one or more queue clients (each of those being its own process, stood up by a WP CLI command).
The idea was that the server continually monitored the queue and fed actions to the next available client. In suitable environments, this greatly increases concurrency and (assuming normal queue runners are disabled) eliminates the potential for database deadlocks during the claim process, because it does away with the idea of claiming blocks of actions.
However, as the above comments show, there are lots of other ways to tackle this general problem (many of which are simpler and therefore may be more robust).
> Ideally would be nice to have a constant that we can define to disable all the wp_remote_post at the end of the shutdown action. define('AS_DISABLE_SHUTDOWN_TRIGGERS', true); or something like that.
We do have an existing hook that can be used to prevent this:
add_filter( 'action_scheduler_allow_async_request_runner', '__return_false' );
Does that meet your needs, or are there cases it doesn't cover?
Somehow I missed that filter. That's good to know. Thanks.
> Also a new command that doesn't exit out when no task left. Instead, it'll just do a sleep and wait for new tasks to come in.
You have to be really cautious of memory leaks throughout the application if you go this route. Things like the local object cache or the DB query history will continually fill up over time, eventually leading to an OOM, usually at a bad time.
Running commands on a schedule that kill themselves off after a reasonable time limit helps you avoid chasing down the various plugins that might have memory leaks (which can be a lot, since the normal PHP request cycle doesn't really punish memory use very often).
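A minimal version of that schedule-and-die pattern (paths hypothetical) is a plain crontab entry, optionally wrapped in timeout(1) as a hard cap in case a job hangs past the AS time limit:

```
# Hypothetical crontab entry: start a runner every 2 minutes. AS stops it
# after action_scheduler_queue_runner_time_limit seconds, and timeout(1)
# hard-kills it at 150s if it ever hangs past that.
*/2 * * * * timeout 150 wp action-scheduler run --path=/var/www/html >/dev/null 2>&1
```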
The above, plus (depending on how things are configured) different types of problems relating to cached values can surface, and can be especially problematic in very long running processes.
We looked at this again and didn't see an actionable next step or indeed sustained interest / an urgent need for this enhancement. I'll close this issue for now, but if anyone would like to add more context about why this enhancement is important, how it should work, and what it would enable, then feel free to reopen.