RFC 72: Background workers

Open RealOrangeOne opened this issue 3 years ago • 17 comments

Great work, guys, please keep going! One note: make it an optional dependency, because currently many developers don't need all features of wagtail that may need background workers or any automated tasks.

hazho avatar Sep 27 '21 15:09 hazho

make it an optional dependency

Completely agree. Wagtail can't assume anything about how people are set up. Some hosting environments don't have the ability to run background tasks like this, and we don't want to limit where we can run.

because currently many developers don't need all features of wagtail that may need background workers or any automated tasks

This may be true for the things which are currently run in the background, but as this integration develops, things which currently block web requests will be moved over too, possibly allowing more features if we know they're not time sensitive. It's entirely possible some people won't need the background workers, but I highly doubt that all of those people wouldn't see a benefit from deploying them. Whether that deployment is worthwhile to them is obviously 100% subjective and not something we can assume, again hence making it both optional and opt-in.

RealOrangeOne avatar Sep 28 '21 08:09 RealOrangeOne

I've re-written the implementation chunk of this RFC, specifically to change to more of a backend-based approach rather than having Wagtail implement a lot of it itself or depend fully on Huey. This makes the implementation far more generic and powerful, and reduces the amount of complexity added into Wagtail itself.
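
As a rough illustration of what a backend-based approach can look like (this is not taken from the RFC; every class and function name below is hypothetical), calling code enqueues work through one small interface and the configured backend decides how and when it runs:

```python
from abc import ABC, abstractmethod
from typing import Any, Callable


class BaseTaskBackend(ABC):
    """Hypothetical interface that every task backend would implement."""

    @abstractmethod
    def enqueue(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        """Accept a task; when it actually runs is up to the backend."""


class ImmediateBackend(BaseTaskBackend):
    """Runs the task inline, for setups that cannot run a worker at all."""

    def enqueue(self, func, *args, **kwargs):
        return func(*args, **kwargs)


class DeferredBackend(BaseTaskBackend):
    """Holds tasks in memory until something drains the queue."""

    def __init__(self):
        self.pending = []

    def enqueue(self, func, *args, **kwargs):
        self.pending.append((func, args, kwargs))

    def run_pending(self):
        # Execute queued tasks in the order they were enqueued.
        results = []
        while self.pending:
            func, args, kwargs = self.pending.pop(0)
            results.append(func(*args, **kwargs))
        return results


def reindex(page_id):
    # Stand-in for real work such as updating a search index.
    return f"reindexed page {page_id}"
```

With this shape, swapping `ImmediateBackend` for `DeferredBackend` (or for a wrapper around Huey, Celery, and so on) would not require changes to the code that calls `enqueue`, which is what keeps the feature optional.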

Thanks to @tomusher for starting the thoughts around this new direction and initial sketches.

RealOrangeOne avatar Oct 18 '21 13:10 RealOrangeOne

Great work writing this up... and I agree this would be very beneficial to have in Wagtail. While the lowest friction option is to use an existing task scheduler system (Huey, Celery, etc.) those are going to have a couple design challenges that will make this feature harder to adopt.

Data storage: Huey supports SQLite and Redis, celery supports Redis. It would be really nice if the datastore could be the same as the wagtail database to avoid adding another layer of complexity. That means task queuing would ideally be backed by Django models. The database is already the source of truth... there is no reason to require yet another service in the mix.

Task runner process: The hosting provider will most likely have to run a second process (aside from WSGI/server) for the task scheduler. It would be really beneficial if whatever task scheduler backends provided by wagtail have a uniform way of spawning this process from Python, similar to how WSGI works (e.g. Django has a wsgi.py file which can be run according to a standard interface). Huey-django uses a management command to spawn this process.

Management of this process becomes more complicated with distributed setups (e.g. people running wagtail via docker images, multi-server setups, etc.). Because no one backend would be in charge of running the process, you'd want a dedicated process server. I'm not saying we need to solve for that scenario, just provide official guidance on how to handle it.

vsalvino avatar Oct 18 '21 19:10 vsalvino

It would be really nice if the datastore could be the same as the wagtail database

Yes, the ORM backend will use the same database as Wagtail to avoid complexity.

The hosting provider will most likely have to run a second process

The original Huey implementation of this RFC allowed both a long-running process and a cron-based "run everything then terminate" style executor. Whilst how the actual workers are run is down to whichever task runner a user chooses, I think the ORM backend in Wagtail should be able to run under the cron-style runner so it doesn't depend on a dedicated process. In theory it'd be possible to do things in the same process, but that'd depend strictly on ASGI.

Management of this process becomes more complicated with distributed setups

It does, but it's not difficult. If you are running the application inside Docker, then it's just a matter of changing the entrypoint to run, say, the worker rather than gunicorn. Making sure the cron jobs or background process run is out of scope for Wagtail to define, so that we can be as versatile as possible. However, these 2 methods are very standard, and anyone deploying an application which requires background processes will likely already be familiar with running them.
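
One way to picture the entrypoint switch is a single image whose start-up script picks its command from a role. This is only a sketch: the `myproject` module path and the `run_worker` management command are placeholders, not real Wagtail or Django commands.

```python
import os

# Placeholder commands: "myproject" and "run_worker" are hypothetical.
COMMANDS = {
    "web": ["gunicorn", "myproject.wsgi:application"],
    "worker": ["python", "manage.py", "run_worker"],
}


def build_command(role):
    """Return the argv list for the given container role."""
    if role not in COMMANDS:
        raise ValueError(f"Unknown role {role!r}; expected one of {sorted(COMMANDS)}")
    return COMMANDS[role]


def exec_role(role):
    """Replace the current process with the command for this role."""
    cmd = build_command(role)
    os.execvp(cmd[0], cmd)
```

The container's entrypoint would then call something like `exec_role(os.environ.get("APP_ROLE", "web"))`, where `APP_ROLE` is again an assumed variable name, so the same image can be deployed once as the web server and once as the worker.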

RealOrangeOne avatar Oct 19 '21 16:10 RealOrangeOne

Discussion thread -> https://github.com/wagtail/wagtail/discussions/7750

(Specific comments on PR still more than welcome!)

RealOrangeOne avatar Jan 06 '22 14:01 RealOrangeOne

Saw this Python library pop up and thought it would be a good link to go here https://rocketry.readthedocs.io/en/stable/

Not saying we should use it but it has a nice API and may serve as some inspiration for parts of this RFC.

lb- avatar Sep 30 '22 06:09 lb-

I discovered this one as well that is based on Postgres https://procrastinate.readthedocs.io/en/stable/discussions.html#how-does-this-all-work

No additional dependencies needed, only Postgres

fabienheureux avatar Jan 20 '23 11:01 fabienheureux

👋 for anyone interested in this, just wanted to note we’ve provisionally scheduled this on the Wagtail roadmap for version 6.1* (May 2024 release, see RFC 91). I believe @gasman said he was keen to lead on this.

thibaudcolas avatar Feb 05 '24 13:02 thibaudcolas

Hello, everyone! Just chiming in with a package suggestion. There is django-q2, which integrates seamlessly with Django and supports multiple backends, including the Django ORM, Redis, etc.

Tobi-De avatar Feb 05 '24 17:02 Tobi-De

Hi all.

There's a massive appetite for a solution like this in Django itself.

As per a fedi-thread, I'd like to suggest making this RFC into a DEP and pushing on getting it into Django.

From the thread:

Only issue I could see is we're due to work on this very soon.

To which...

Suggestion: Put it in a third party package (maybe with an import shim in Wagtail if you want it easy to install or automatically installed) and let's begin the Merge to Django conversation. Targeting 5.2 or 6.0 gives time for bugs to show, and folks on 4.2+ can use the external package now. That’s the way.

My view is that Django should have had a story here a while back. It's something that should be part of the core framework. That you're putting in the effort here, it's time. We should point that at core, and then make it happen.

I'm happy to help with a DEP and play the shepherd role there if that's of help.

(Also, ref @Tobi-De's comment, Django-Q(2) has a lot of good bits to take inspiration from.)

carltongibson avatar Feb 07 '24 07:02 carltongibson

@carltongibson thanks! :heart: Expect a draft DEP from me in the coming days.

We discussed this in a Wagtail Core Team meeting this morning, and agree it makes sense to create the initial draft package externally to both Wagtail and Django, to allow Wagtail to use it sooner, with the intention to upstream it into Django in parallel.

RealOrangeOne avatar Feb 07 '24 10:02 RealOrangeOne

@RealOrangeOne AWESOME! 🤩

carltongibson avatar Feb 07 '24 10:02 carltongibson

@carltongibson thanks! ❤️ Expect a draft DEP from me in the coming days.

We discussed this in a Wagtail Core Team meeting this morning, and agree it makes sense to create the initial draft package externally to both Wagtail and Django, to allow Wagtail to use it sooner, with the intention to upstream it into Django in parallel.

Simply magnificent, the best thing I've read this morning.

Tobi-De avatar Feb 07 '24 11:02 Tobi-De

Hey, do you know if this is fully implemented? Ig this was one of the past GSOC ideas. If it's not, can it be done in GSOC 2024?

kituuu avatar Mar 05 '24 18:03 kituuu

Hey, do you know if this is fully implemented? Ig this was one of the past GSOC ideas. If it's not, can it be done in GSOC 2024?

This idea has moved to Django itself, so this framework is going to be developed and shipped as part of Django once its RFC planning is complete and merged.

salty-ivy avatar Mar 05 '24 20:03 salty-ivy

The core implementation of this is now a DEP: https://github.com/django/deps/pull/86.

It's likely Wagtail will be one of the first deployed uses of it, but for now the proposal process and discussion have moved there.

RealOrangeOne avatar Mar 06 '24 08:03 RealOrangeOne

👋 I’m going to close this now, as the RFC has been replaced by a DEP. There’ll be follow-up work needed in Wagtail to leverage those new capabilities but this either won’t require an RFC, or would require one that’s more specific to the use case.

thibaudcolas avatar Jun 14 '24 15:06 thibaudcolas

Just to add, a draft PR has been made for a POC of turning Wagtail's mechanisms into tasks for django-tasks: https://github.com/wagtail/wagtail/pull/12040

laymonage avatar Jun 14 '24 15:06 laymonage