
APScheduler 4.0 progress tracking

Open agronholm opened this issue 4 years ago • 189 comments

I'm opening this issue as an easy way for interested parties to track the development progress of the next major APScheduler release (v4.0).

Terminology changes in v4.0

The old term "Job", as it was, is gone, replaced by the following concepts, which are closer to the terminology used by Celery:

  • Task definition: a uniquely named callable coupled with configuration like maximum number of instances, misfire grace time etc.
  • Schedule: binds a trigger with a task definition
  • Job: queued work item for an executor (binds to a task definition, and optionally a schedule)

Also, the term "executor" is now being changed to "worker".

Notice that the terminology may still change before the final release!

Planned major changes

v4.0 is a ground-up redesign that aims to fix all the long-standing flaws found in APScheduler over the years.

Checked boxes are changes that have already been implemented.

  • [X] Async-first design, with support for asyncio and trio (via AnyIO)
  • [X] Static typing friendly (PEP 561)
  • [X] Support for serializers other than pickle
  • [X] Broader time zone support, including zoneinfo time zones (PEP 615)
  • [X] Drop support for Python < 3.7
  • [X] Calendar interval trigger
  • [X] Stateful triggers
  • [X] threshold value for AndTrigger (resolves issues with contained IntervalTrigger instances)
  • [X] The interval trigger should start right away and not after the first interval (#375)
  • [X] Persistent store sharing among multiple schedulers (arguably the most needed feature ever for APScheduler)
  • [X] Decoupling of schedulers and workers
  • [x] Schedule-level jitter support
  • [x] Context-local job metadata information
  • [x] Easy launching of tasks immediately without needing a schedule
  • [x] Failure resilience for persistent data stores (so they don't crash the scheduler on a temporary outage)

Potential extra features I would like to have:

  • [ ] Support for tags in task definitions, schedules and jobs (#798)
  • [ ] Stateful jobs
  • [ ] Ability to cancel jobs
  • [ ] Timeouts for jobs
  • [ ] "threshold" value for OrTrigger (#453)

You will notice that I have dropped a number of features from master. Some I may never add back to v4.0, even if requested, but do voice your wishes in this issue (and this issue only – I will summarily close such requests in new tickets). Others have been removed only temporarily to give me space for the redesign.

Features on the chopping block

  • Twisted scheduler (may be usable through the async scheduler if AnyIO ever gets Twisted support)
  • Tornado scheduler (just use the async scheduler)
  • Gevent scheduler (does not play well with the new architecture)
  • ~Qt scheduler (difficult to test/maintain)~
  • Redis as a data store (may not have sophisticated enough querying capabilities)
  • RethinkDB data store (the company went belly up some time ago)
  • Zookeeper as a data store (may not have sophisticated enough querying capabilities)

Being on the chopping block does not mean the feature will be gone forever! It may return in a subsequent minor release, or even before the 4.0 final release, if I deem it feasible to implement on top of the new architecture.

agronholm avatar Sep 29 '20 21:09 agronholm

The master branch is now in a state where both the async and sync schedulers work, albeit with a largely incomplete feature set. Next I will focus on getting the first implementation of shareable data stores, based on asyncpg. I've made some progress on that a while back but got sidetracked by other projects, particularly AnyIO.

agronholm avatar Sep 29 '20 21:09 agronholm

Regarding Twisted scheduler on the chopping block for APScheduler v4.

My main OSS project is a multi-process app that spins up many Twisted reactors in those processes, where several of the sub-processes use APScheduler inside the reactor (https://github.com/opencontentplatform/ocp). What would be a safe replacement scheduler if the Twisted version is being removed?

codingadvocate avatar Sep 29 '20 21:09 codingadvocate

So you run multiple schedulers? Are you sharing job stores among them?

The main reason I'm thinking of dropping (explicit) Twisted support is because it carries a heavy burden of legacy with it. I will play around with it and see if I can make it work at least with the asyncio reactor. If it can be made to work with a small amount of glue, I will take it off the chopping block.

agronholm avatar Sep 29 '20 21:09 agronholm

Yes, it runs multiple instances of the schedulers - with their own independent job stores.

I understand the need for software redesigns, and I'm certainly not pushing back or trying to make more work for you. Just trying to understand what the recommendation would be. Maybe I could fall back to using APS' BackgroundScheduler since I don't spin it up until after the reactors are running? Either way, I saw the note and want to ensure I follow whatever happens on that one.

Either way, thank you for the solid project.

codingadvocate avatar Sep 29 '20 22:09 codingadvocate

Are the jobs you run typically asynchronous (returning Deferreds) or synchronous (run in threads)?

agronholm avatar Sep 29 '20 22:09 agronholm

The initial setup with creating job definitions is synchronous. Any updates to previous job definitions or newly created jobs (stored/managed in a DB) occur regularly in an asynchronous manner (LoopingCall that returns a Deferred). And all the work with job runtime (execution/management/reporting/cleanup) occurs in non-reactor threads.

codingadvocate avatar Sep 29 '20 22:09 codingadvocate

Ok, so it sounds like the actual job target functions are synchronous, correct? Then you would be able to make do with the synchronous scheduler, yes?

agronholm avatar Sep 30 '20 06:09 agronholm

If you're saying so, then yes. I defer to your knowledge there. I selected the TwistedScheduler since the user guide's choosing-the-right-scheduler section said to do so when building a Twisted application.

I apologize for compounding the response with a question, but it's related. How is the thread pool and thread count handled if I use something other than the TwistedScheduler? Will the job run inside Twisted's thread pool, or inside BackgroundScheduler's thread pool? Do I need to extend both?

Does constructing the BackgroundScheduler with an explicit max_workers count (example below) do anything when it's running inside Twisted's reactor?

self.scheduler = BackgroundScheduler({
    'apscheduler.executors.default': {
        'class': 'apscheduler.executors.pool:ThreadPoolExecutor',
        'max_workers': '25'
    }
})

codingadvocate avatar Sep 30 '20 13:09 codingadvocate

Will the job run inside Twisted's thread pool, or inside BackgroundScheduler's thread pool? Do I need to extend both?

The sync scheduler (including 3.x's BackgroundScheduler) knows nothing about Twisted's thread pool. The Twisted scheduler in 3.x differs from BackgroundScheduler only in that its default executor uses the Twisted reactor's internal thread pool. It doesn't even have async support!

I want to provide first class async support in APScheduler 4.x. If I can do that with Twisted without having to create an entire ecosystem of Twisted specific components, then I'm open to doing that.

agronholm avatar Oct 02 '20 16:10 agronholm

I just added a few items to the description:

  • External workers
  • Schedule-level jitter support
  • Ability to cancel jobs
  • Timeouts for jobs
  • Redis as data store
  • Zookeeper as data store
  • "executor" being renamed to "worker"

agronholm avatar Oct 04 '20 09:10 agronholm

What do you think about adding optional OpenTelemetry support?

thedrow avatar Oct 05 '20 11:10 thedrow

I am open to it, but only as soon as their API stabilizes. As it stands, every beta release breaks backward compatibility. I have more important issues to work on. I don't think v4.0 will have OpenTelemetry support but I will consider adding it to a minor update release once they are in GA.

agronholm avatar Oct 05 '20 15:10 agronholm

A lot of progress has been made on the core improvements of v4.0. Vast code refactorings have taken place. The data store system is really taking shape now.

I've added "Failure resilience for persistent data stores" to the task list. It's one of the most frequent deployment issues with APScheduler, so I'm making sure that it's adequately addressed in v4.0.

I'm not sure what to do with the event system. I may rip it out entirely until I can figure out exactly how it should work. I know users will want to know when a job completes or a misfire occurs etc., so it will be implemented in some form at least before the first release.

I will post another comment when I've pushed these changes to the repository.

agronholm avatar Oct 13 '20 06:10 agronholm

I hit a snag with the synchronous version of the scheduler. I tried to use the AnyIO blocking portal system to run background tasks but I had to conclude that it won't work that way. I have an idea for that though.

agronholm avatar Oct 30 '20 09:10 agronholm

@agronholm do you have any estimate when 4.0 would be released?

jykae avatar Dec 09 '20 08:12 jykae

I had hoped at least for an alpha at this point, but the design problems in the sync version killed the momentum I had. I have not done any significant F/OSS development since. I am still committed to getting 4.0 done, but due to pressure at work I don't think I can work on it before Christmas holidays.

agronholm avatar Dec 09 '20 08:12 agronholm

@agronholm How will you make the job store shareable among multiple schedulers?

williamwwwww avatar Dec 14 '20 05:12 williamwwwww

@agronholm How will you make the job store shareable among multiple schedulers?

By coordination and notifications shared between schedulers. Notifications are optional but recommended, and without notifications the schedulers will periodically check for due schedules. How all this works is specific to each store implementation.
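
As a purely illustrative sketch of the idea (the store methods and the polling interval below are hypothetical, not the actual API), a scheduler's main loop combines periodic checks for due schedules with an optional early wake-up from notifications:

import threading
from datetime import datetime, timezone

def scheduler_loop(store, wakeup: threading.Event, poll_interval: float = 1.0) -> None:
    """Process due schedules, then sleep until a notification arrives or the poll interval elapses."""
    while True:
        now = datetime.now(timezone.utc)
        for schedule in store.acquire_due_schedules(now):  # hypothetical store API
            store.enqueue_job(schedule)                     # hand the work to a worker queue
            store.release_schedule(schedule)
        # A notification from another scheduler (e.g. a newly added schedule) sets the event
        # and wakes us up early; without notifications we fall back to periodic polling.
        wakeup.wait(timeout=poll_interval)
        wakeup.clear()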

agronholm avatar Dec 14 '20 05:12 agronholm

Hello @agronholm

Impressive task list and thanks for apscheduler.

My big Christmas wish is "locking" (probably the idea behind persistent storage).

I use APScheduler on several web nodes; each node has some workers.

Today, I subclass the scheduler, store, etc. to add locking.

Instead of using add_job I call queue_job, which creates an event; every worker wakes up, and the first one to take the job locks it (using NX with Redis + the Redlock algorithm). When the job exceeds a certain time, I mark it as "dead" and our alerting reports the dead job.

For me it's mandatory that a task never belongs to a specific worker; the job must stay in a queue so that another worker (or the same one) can process it.

To achieve this I added the following keys in Redis (alongside the jobs and running keys): "ready", "locked", "dead", "failed", "done".

  • queue adds the job to ready
  • a queued event wakes everyone up, and each worker tries to take the lock
  • when the lock is acquired, the job moves from ready to the jobs key (which is what you use to process the job)
  • a listener is added on the task: on success the job moves to the done key and the lock is released, otherwise it moves to the failed key and the lock is released
  • if a job holds a lock and its status is never acknowledged, it is moved to dead and the lock is released (this part is tricky because what counts as a dead job depends on the nature of the job)

I'm a big fan of Sidekiq (and also Faktory)

And I would be very happy with something like this:

In the "main"

def myfunc(x, y):
    print(x, y)

scheduler = Scheduler(...)
# register myfunc as a valid callable to avoid pickle on func
scheduler.register('myfunc', myfunc)
scheduler.start()

Then in code

# note that myfunc is referenced by its string name
job = scheduler.queue('myfunc', kwargs={"x": 1, "y": 2})
print(job.status) # ready - no one process it
...
print(job.status) # pending - someone process it
...
print(job.status) # done - success

Why not Celery?

I don't want to set up the full Celery/Flower stack; my tasks are simple, and I'm a bit lazy to repackage an entire app or split a few lines of code into small libraries just to let Celery run my code (and also split config, credentials, etc.). I prefer using Celery only when necessary.

I don't know if I'm clear (I'm not a native English speaker).

ahmet2mir avatar Dec 16 '20 14:12 ahmet2mir

@ahmet2mir APScheduler 4.0 already has the proper synchronization mechanisms in place.

What's still missing is the synchronous API. I've come to a realization that I cannot simply copy the async API and remove the async keywords because cancellation isn't going to work with the sync API, and AnyIO's BlockingPortal mechanism (as it is currently) is inadequate for cases where you need to start background tasks. I must address this issue first and then come back to finish the basic APScheduler 4.0 API.

agronholm avatar Jan 04 '21 10:01 agronholm

While 4.0 is being worked on, I've gone back to the 3.x branch for a bit and fixed a number of bugs and other annoyances.

agronholm avatar Jan 20 '21 17:01 agronholm

Tests on async/sync workers (formerly: executors) are passing now, but the sync worker tests are strangely slow and I want to get to the bottom of that before moving forward.

agronholm avatar Jan 31 '21 20:01 agronholm

Slowness in worker tests resolved: it was a race condition in which the notification about the newly added job was sent before the listener was in place, causing the data store to wait for the 1 second timeout to expire before checking for new jobs again.

I'll move on to completing the synchronous scheduler code now. I'm also very close to releasing AnyIO v2.1.0 which is a critical dependency for APScheduler 4.

agronholm avatar Feb 07 '21 17:02 agronholm

I can't wait...

thedrow avatar Feb 09 '21 10:02 thedrow

Tests for both sync and async schedulers pass, but the tests run into delays caused by the new schedule/job notifications not working as intended, plus the sync scheduler tests are causing lots of "Task exception was never retrieved" errors outside of the actual tests which I will have to investigate. I'm considering making an alpha release once these issues have been ironed out.

agronholm avatar Feb 13 '21 23:02 agronholm

That would be very helpful.

thedrow avatar Feb 14 '21 12:02 thedrow

After hours of debugging, I finally figured out that I was needlessly creating a new task group in the worker's run() method and overwriting the outer task group as a worker attribute. The odd errors went away after I fixed that.

agronholm avatar Feb 14 '21 21:02 agronholm

I've just pushed a big batch of changes that implement data store sharing on PostgreSQL (via asyncpg) and MongoDB (via motor). There are a lot of rough edges but at least the whole test suite passes now (at least locally – CI seems to have some troubles). In the coming days I'll try to polish the code base to the point where I can at least make an alpha release.

Feel free to try it out, but you'll have to look at the test suite for some idea on how to use it since I haven't updated the docs yet. Also, the database schema will change before the final release (tasks accounting is not currently done) so expect to have to throw out your schedules and jobs.

agronholm avatar Feb 24 '21 22:02 agronholm

Is master now usable?

thedrow avatar Feb 25 '21 05:02 thedrow

Usable in the sense that basic functionality works, but I wouldn't rely on it for anything remotely important.

agronholm avatar Feb 25 '21 07:02 agronholm

jumpstarter is not production-ready yet.

thedrow avatar Feb 25 '21 12:02 thedrow

Maybe I missed it in the thread, but when you publish, will it have a different package name on PyPI? Thanks!

christopherpickering avatar Feb 25 '21 21:02 christopherpickering

It will have the same name, so if you're concerned, pin your apscheduler dependencies to < 4.

agronholm avatar Feb 25 '21 21:02 agronholm

On another note, CI runs now work again. It was a bit of a head scratcher but turns out the culprit for the CI runs freezing was freezegun v1.1.0. I didn't see this locally because v1.1.0 was freshly released and I hadn't updated my dependencies lately. Pinning to v1.0.0 solved the problem.

agronholm avatar Feb 25 '21 22:02 agronholm

We use APScheduler in a Qt app to schedule backups. It works well and I plan to revamp that integration soon, so it would be good if QTimer support could stay. The 3.x implementation was only a few lines, given that QTimer is quite high level.

Is it still feasible to use QTimer in a Scheduler subclass and would you consider merging a PR for it? Or would APS 4 work just as well in a Qt app?

m3nu avatar Mar 02 '21 00:03 m3nu

It won't be a priority but I will definitely consider this use case.

agronholm avatar Mar 02 '21 07:03 agronholm

Looking forward to the fix for https://github.com/agronholm/apscheduler/issues/285 landing in v4.0 (or some other release) 👍

huangsam avatar Apr 29 '21 16:04 huangsam

It's been a while since the last update. My development time has mostly been spent on improving the AnyIO project, and this work has yielded the much awaited 3.0 release which will also benefit APScheduler.

My next step is to refactor the current postgresql data store into a new SQLAlchemy 1.4 based store which would work not only with postgresql but also with mysql and sqlite. I will also implement task accounting (keeping track of how many running jobs per task there are, and ensuring that the limits are respected).

One big decision ahead is how to support data stores using synchronous APIs. It may not make a lot of sense to have the synchronous scheduler use async behind the scenes if the data store is fully synchronous. And supporting only the above database backends would leave a lot of users in the cold.

agronholm avatar May 10 '21 21:05 agronholm

You can use a thread pool.

thedrow avatar May 11 '21 10:05 thedrow

Hi @agronholm , is there any update on this release or on #256 in particular?

samh194 avatar Aug 13 '21 19:08 samh194

Hi @agronholm , is there any update on this release or on #256 in particular?

Initial work has landed on job store sharing, but it's lacking critical components:

  • task accounting (making sure the max number of concurrent instances of a task is respected across workers)
  • event delivery

I started work on task accounting last week. The idea is that schedules and jobs are linked to "tasks". A "task" contains some configuration parameters like maximum concurrency, statefulness etc., and links to a callable. When you create a schedule or a job, you need to pass to the method either:

  1. the ID of a previously created task
  2. a callable object
  3. a textual reference (modulename:variablename) to a callable (does not need to be importable on the scheduler process if a persistent job store is used)

When a worker acquires a job, the task gets its counter incremented, and when the job is released, the counter gets decremented. The data store will ensure that the total concurrency of a task never goes above the limit.

I'm still working to figure out the exact semantics involved here, like what happens when a worker tries to acquire a task that has all its concurrency slots taken.
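
To make the accounting rule concrete, here is a minimal, purely illustrative sketch of the counter logic described above (the names and the locking strategy are assumptions, not the actual data store implementation):

import threading

class TaskAccounting:
    """Illustrative sketch only: tracks running jobs per task and refuses to hand
    out jobs beyond the configured maximum."""

    def __init__(self, max_running_jobs: int):
        self.max_running_jobs = max_running_jobs
        self.running_jobs = 0
        self._lock = threading.Lock()

    def try_acquire(self) -> bool:
        # Called when a worker wants to pick up a job belonging to this task.
        with self._lock:
            if self.running_jobs >= self.max_running_jobs:
                return False  # all concurrency slots taken; the job stays in the store
            self.running_jobs += 1
            return True

    def release(self) -> None:
        # Called when the job is released, whether it finished successfully or not.
        with self._lock:
            self.running_jobs -= 1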

agronholm avatar Aug 15 '21 19:08 agronholm

Also, to simplify the implementation of shared data stores, I have decided not to require them to deliver events beyond the current process. This decision could be changed if there is high demand for it.

agronholm avatar Aug 15 '21 19:08 agronholm

Also, I'm now trying out a design where synchronous and asynchronous data stores have separate interfaces, and if you use a synchronous data store with an async scheduler, the scheduler will just wrap the store with threads. I may have been trying too hard to deduplicate the code between sync/async, but if I make the synchronous scheduler independent of the async scheduler, things might get easier. I'm not sure this will be the final design choice that I'll take, but it's something I'm exploring now.
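
As a rough sketch of that wrapping approach, using AnyIO's to_thread.run_sync (the store methods shown here are hypothetical, not the real interface):

from anyio import to_thread

class AsyncDataStoreAdapter:
    """Hypothetical wrapper that exposes a synchronous data store through an async
    interface by running its blocking methods in worker threads."""

    def __init__(self, sync_store):
        self._store = sync_store

    async def add_schedule(self, schedule, conflict_policy):
        # Run the blocking call in a worker thread so the event loop is not blocked.
        return await to_thread.run_sync(self._store.add_schedule, schedule, conflict_policy)

    async def acquire_jobs(self, worker_id, limit):
        return await to_thread.run_sync(self._store.acquire_jobs, worker_id, limit)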

agronholm avatar Aug 20 '21 10:08 agronholm

Tonight I managed to (hopefully) nail down the new event dispatch system which also has tests with 100% coverage. The work on sync/async schedulers and workers is also coming along nicely. I will commit all the new work to master in a big push once it's in a coherent state.

agronholm avatar Aug 21 '21 22:08 agronholm

There is a problem that is rapidly becoming apparent: the majority of shared data stores do not support any mechanism with which to be notified of external changes. This is important for job store sharing to fully work as intended.

Of the job stores implemented so far, PostgreSQL (either directly via asyncpg, or SQLAlchemy) and MongoDB are capable of providing at least some level of notifications. Even PostgreSQL has limitations in its notification mechanism: it can only deliver messages shorter than 8000 bytes, making it impractical for universal event delivery, since such a system would have to cope with essentially arbitrarily sized events. MongoDB, on the other hand, only supports this when configured to be used as a replica set.

These shortcomings made me consider again the idea of having a "side channel" for broadcasting events. I can think of at least 3 different services which might be suitable for delivering events to all interested parties:

  • Redis
  • Kafka
  • RabbitMQ (any AMQP based message server, really)

As for cases like PostgreSQL, I'm planning to have the store optionally emit a "limited" event that would wake up the scheduler on an external change. That would allow users to use a shared job store without having to run another service just for getting notifications to work.
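
As an example of what such a "limited" wake-up could look like using PostgreSQL's LISTEN/NOTIFY via asyncpg (the channel name and payload here are made up for illustration, not the actual implementation):

import asyncio
import asyncpg

async def main():
    conn = await asyncpg.connect("postgresql://user:pass@localhost/apscheduler")
    wakeup = asyncio.Event()

    def on_notification(connection, pid, channel, payload):
        # The payload stays tiny (well under the ~8000 byte limit); it only signals
        # "something changed", prompting the scheduler to re-read the data store.
        wakeup.set()

    await conn.add_listener("apscheduler_wakeup", on_notification)

    # Elsewhere, another scheduler/process announces a change:
    await conn.execute("NOTIFY apscheduler_wakeup, 'schedules_updated'")

    await wakeup.wait()
    print("woken up; checking the data store for changes")
    await conn.close()

asyncio.run(main())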

agronholm avatar Aug 25 '21 20:08 agronholm

The data store tests are passing again after my latest batch of changes, and the event dispatch system is shaping up really well. I even managed to make the PostgreSQL data store relay its own events through the database server as asynchronous notifications, making it quite suitable for use without an external messaging system. I will certainly tweak the event system further to make sure the whole system is as observable as it reasonably can be.

My goal is also to do away with the PostgreSQL job store (currently in master only) entirely in favor of the async SQLAlchemy store. I would add its notification features to the SQLAlchemy store which would then be used if a compatible driver was detected.

On another note, I decided to rely on attrs over dataclasses. I tried really hard to love dataclasses, but seeing that you cannot have ANY optional fields in superclasses when subclasses have mandatory fields is just a showstopper problem for me (dataclasses cannot force keyword arguments, unlike attrs).
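
To illustrate the inheritance problem and the attrs workaround (a minimal example, not APScheduler code):

import attr

@attr.s(kw_only=True)
class Trigger:
    # An optional field in the superclass is fine because all fields are keyword-only.
    timezone = attr.ib(default="UTC")

@attr.s(kw_only=True)
class IntervalTrigger(Trigger):
    # With plain dataclasses (pre-3.10 kw_only support) this mandatory field would raise
    # TypeError, because a non-default field cannot follow a defaulted one.
    seconds = attr.ib()

trigger = IntervalTrigger(seconds=10)  # timezone falls back to its default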

Once I get the rest of the test suite to pass again I will push the changes to Github.

agronholm avatar Aug 28 '21 15:08 agronholm

The entire test suite passes now and I have pushed the latest changes to Github. There is an intermittent failure related to the memory job store in test_remove_job() when being wrapped as an async datastore. I'm tracking down this annoying Heisenbug.

The API currently revolves around context managers, but I'm going to try to refactor it to be more convenient when used in environments that are less context manager friendly.

I didn't get task accounting done in this batch, but it will be my next focus now that the event dispatch system is in a better shape.

EDIT: the race condition has been fully fixed now.

agronholm avatar Aug 28 '21 22:08 agronholm

Just fixed a bunch of bugs and got CI to pass on all supported Python versions and platforms (except for Windows+Py3.10 where psycopg2 won't compile).

agronholm avatar Aug 29 '21 10:08 agronholm

I took a little detour, adding MySQL and SQLite to the testing matrix, and I'm glad I did because it revealed a bunch of problems. MySQL's timestamp columns don't support fractional seconds by default, and I had to refactor the SQLAlchemy data store a bit to add a workaround for this particular vendor (i.e. using a MySQL-specific timestamp type that enables fractional seconds). As for SQLite, it trashed my latest attempt to let the SQLAlchemy store forego marking schedules as "acquired" by a scheduler, because it did not honor the row-level locking. The context manager based schedule acquisition API might have been problematic for some other data store implementations anyway. But these issues have been sorted out now and the test suite passes again for all data stores. As a side note, MySQL is considerably slower here than the other back-ends and I would not recommend it to anyone.
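
The MySQL workaround mentioned above amounts to something like this in SQLAlchemy terms (a sketch of the general technique; the actual column definitions in APScheduler may differ):

from sqlalchemy import Column, DateTime
from sqlalchemy.dialects.mysql import TIMESTAMP as MySQLTimestamp

# On most backends a timezone-aware DateTime keeps fractional seconds as-is, but MySQL's
# TIMESTAMP needs an explicit fractional-seconds precision (fsp), hence the dialect variant.
timestamp_type = DateTime(timezone=True).with_variant(MySQLTimestamp(fsp=6), "mysql")

# Used like any other column type, e.g.:
next_fire_time = Column("next_fire_time", timestamp_type, index=True)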

With that out of the way, I was able to add the first implementation of task accounting. Data stores will now keep track of the number of running jobs per task, and not give workers jobs that would raise the total number for that task above its maximum concurrent job limit. This code will need a lot more tests and polish before I can say it's finished, but the basic functionality is there.

I've also refactored most of the classes to use attrs and all of them to use Python 3.10 style type annotations. I hope this doesn't inconvenience anyone (the library still remains Python 3.7+ compatible).

There are still a couple of unsolved problems with task accounting:

  1. How would the system allow tasks to have callables that aren't addressable as modulename:varname? Lambdas and local functions (functions inside functions) fall into this category. Normally when a task definition is added to the data store, this addition is communicated to other data stores but there is no way to automatically replicate the same task -> function mapping on data stores in other processes. The use case is important for local data stores, however, and it has to share the same API with persistent data stores.
  2. What should happen to jobs when their tasks are updated with different parameters? Should the existing jobs be removed? If the new specification allows fewer concurrent jobs, should we cancel queued jobs? Starting from the oldest or the newest? This is a harder problem than the previous one.

agronholm avatar Sep 05 '21 22:09 agronholm

I've now implemented an event broker system which should allow any persistent data store to be shared safely. The following implementations (in addition to the minimal local-only implementation) are present in the current code:

  • Redis
  • MQTT (via paho-mqtt)
  • PostgreSQL (via asyncpg)

This should be the last major component that was missing from the v4.0 design. From here on out it's just a matter of implementing the promised features, tinkering with the design and polishing the outcome. The documentation will also be largely rewritten but only after the code has more or less settled down.

I've done some work on persistent data stores, too. Each field now corresponds to one column in SQLAlchemy, or one BSON field in a MongoDB document. I made this change to enable more granular updates and deserialization error tracking. The downside is more frequent schema updates when/if more fields are added.

agronholm avatar Sep 11 '21 18:09 agronholm

I've now implemented the ability to create a job directly from the scheduler, without the need for a schedule. That's one more item checked off the list.

The event system got a big load of refinements done too, including context manager support for subscriptions and support for one-shot subscriptions.

One design aspect I'm still undecided on is whether to store job results in the jobs table/collection itself. Currently there is a separate table/collection for that.

agronholm avatar Sep 12 '21 22:09 agronholm

I've just pushed an implementation of schedule-level jitter support. Another item checked off the list.

agronholm avatar Sep 20 '21 21:09 agronholm

And another item crossed off: context-local job info, worker and scheduler (where available).

agronholm avatar Sep 20 '21 23:09 agronholm

Yesterday I pushed code that adds failure resilience to persistent data stores, with fully configurable parameters and exponential backoff by default. The challenge is that different data stores raise different exceptions when they fail. This is particularly problematic with the SQLAlchemy store. I've tested this against both PostgreSQL and MySQL servers, shutting them down while the scheduler is running, and I've confirmed that the scheduler starts running again if I restart the server within the configured timeout period. Larger scale testing and refinement is needed, however, to make this feature "rock solid".
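
In spirit, the retry logic is a classic exponential backoff loop; here is a simplified, generic sketch (the parameter names and the broad exception handling are illustrative only, not the real configuration options):

import random
import time

def call_with_backoff(operation, max_wait: float = 64.0, initial_wait: float = 1.0):
    """Retry a data store operation until it succeeds, sleeping exponentially longer
    (with a little jitter) between attempts instead of crashing the scheduler."""
    wait = initial_wait
    while True:
        try:
            return operation()
        except Exception as exc:  # a real implementation would catch store-specific errors
            print(f"data store error ({exc!r}); retrying in {wait:.1f}s")
            time.sleep(wait + random.uniform(0, wait / 10))
            wait = min(wait * 2, max_wait)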

This marks the completion of the last mandatory feature of APScheduler 4. I'll start implementing the bonus features now and fixing up the scheduler/worker event logic and other issues in the code, and then gradually rewrite the documentation. When that's done, I will make the first pre-release of v4.

agronholm avatar Sep 27 '21 08:09 agronholm

Yesterday I heard about OS level mechanisms that allow more sophisticated waiting techniques than what APScheduler has used so far. Up until now, APScheduler has simply calculated the time differential between the next known scheduled fire time and then used time.sleep() or similar to sleep until then (or until awakened explicitly). This method works well in most circumstances, but not in the following:

  • the host computer is put to sleep
  • the system clock is wound forward

In either of these cases, the scheduler would be blissfully unaware of the abrupt changes in the system clock and may thus sleep past the next fire time. These OS level mechanisms are something I'm now looking to implement for the 4.0 final release, but don't expect to see them in the early pre-releases. They could also be impossible to write automated tests for, so I don't know what to do about that.
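
For context, the naive approach being replaced looks roughly like this, which shows why a suspended host or a clock jump goes unnoticed (a simplified illustration, not the actual scheduler loop):

import threading
from datetime import datetime, timezone

wakeup = threading.Event()

def wait_until(next_fire_time: datetime) -> None:
    # Compute the delay once and sleep for that long. If the host sleeps or the wall
    # clock is wound forward during the wait, the deadline passes unnoticed and the
    # job fires late; an OS-level timer tied to the real clock would avoid this.
    delay = (next_fire_time - datetime.now(timezone.utc)).total_seconds()
    if delay > 0:
        wakeup.wait(timeout=delay)  # returns early only if explicitly woken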

agronholm avatar Oct 02 '21 10:10 agronholm

@agronholm I'm particularly interested in sharable job stores as I commented in another issue, and you pointed me to this one since it's planned for v4.

A couple of days ago, I started tinkering on top of v3.8.1 to get [very basic] shareable MongoDB store support (since the system I'm building, which holds a scheduler instance, is deployed with multiple replicas, and thus it's critical for those replicas to access the same job store). But then I found out the feature was checked off in the list above, so I started digging into the master branch. Specifically, I was looking for changes in the job stores and the scheduler, and I'm very pleased to see the refactoring done to enable acquiring/releasing jobs on a store and pub/sub events that hand jobs to executors. I wondered how you managed to "update" the wakeup time on a certain scheduler instance when another instance added or modified a job, and noticed the _wakeup_event strategy for handling that scenario. Nice and clean design.

I'm really anxious now for a release! Apologies in advance if this one was answered in any other thread but... is there any chance for v4 (beta maybe) to be released any time this month?

dariosm avatar Nov 12 '21 22:11 dariosm

I've stated elsewhere that my goal is to get a prerelease out before the end of the year, so I'm sticking with that. I'm currently focused on upgrading another project of mine (Typeguard) but once that work is finished (at least for a pre-release), I will switch my focus back to APScheduler and determine which of the above "extra" features will make it to the pre-release. Note that before the final release, I will likely change the database schemas without notice or migration paths so you should not run any serious workloads with it yet. There is currently also a problem with the schedulers themselves which makes them continuously wake up due to a broken response logic to schedule update notifications from data stores. I plan to address this ASAP when I return to this project. Also, the largest major subproject remaining before the pre-release is updating the documentation, and I didn't want to start that until the code base was feature complete.

agronholm avatar Nov 13 '21 12:11 agronholm

I've stated elsewhere that my goal is to get a prerelease out before the end of the year, so I'm sticking with that.

Hi @agronholm, I hope you had a wonderful start to the year. Just wondering if there are any plans to release v4 anytime soon (I'm specifically interested in shared job stores, as stated before)? Since I'm in a hurry to resolve the latter, I have to evaluate whether to replace APScheduler and all my work around it, or wait for the new release and check how it works for my particular requirements.

Thanks again for the great work!

dariosm avatar Jan 05 '22 13:01 dariosm

I expect to make an alpha release this month. It will still lack some features, but the basics should be working. The data store system has been completely overhauled so it will most likely have some glitches which need to be ironed out over a lengthy period. I wanted to get a release out last month but alas, I have so many projects that need attention that I simply didn't have the time. Also, the database schema is not yet stable so at least during the alpha period, you would have to start from a clean slate from time to time if you plan to use persistent data stores.

On the upside, I think I have finally resolved the last kink that prevented the schedulers from working properly. I have not pushed that code yet but I expect to do so in the next few days, once I've had a chance to test it a little bit locally. After that, I will start rewriting the documentation and once that's done, I expect to make the first pre-release. I don't want to make any promises, but this month is the target for 4.0.0a1.

agronholm avatar Jan 05 '22 14:01 agronholm

@agronholm sorry for the bump. I'm looking for a job scheduler with Anyio support and would love to use apscheduler. Do you still have plans of releasing version 4.0.0a1? ❤️

bratao avatar Jan 29 '22 06:01 bratao

Sorry for not responding earlier. I've been crazy busy with both paid work, other F/OSS projects and my laptop just broke down. Mental exhaustion also plays a significant part in the lull in my activity with this project. I will get back to it, hopefully this Saturday. My plan still remains the same: fix the schedulers, rewrite the docs and then release 4.0.0a1.

agronholm avatar Feb 11 '22 22:02 agronholm

Just stopping by to say thank you. I have held off thus far to avoid noise/spam, but I must say I do appreciate the hard work on this. I can’t speak for everyone, but I am empathetic towards the challenges of FOSS. I am very patient about this release, and I’m very happy to wait so as not to contribute to burnout.

Apscheduler is probably one of the most powerful and enjoyable pieces of software I’ve had the pleasure of using. It is great you’ve provided this to us to use for free. Thank you and take care.

rogueinkamp avatar Feb 11 '22 23:02 rogueinkamp

@agronholm Thank you, I love your work. Your wellbeing needs to come first, please, take care of yourself! Burnout is something very serious.

I recommend watching Arcane and Encanto.

bratao avatar Feb 12 '22 00:02 bratao

Sorry for not responding earlier. I've been crazy busy with both paid work, other F/OSS projects and my laptop just broke down. Mental exhaustion also plays a significant part in the lull in my activity with this project. I will get back to it, hopefully this Saturday. My plan still remains the same: fix the schedulers, rewrite the docs and then release 4.0.0a1.

@agronholm take care of yourself!

dariosm avatar Feb 12 '22 02:02 dariosm

Thank you so much for the support! That really made my day :heart:

I just pushed a bunch of commits that deliver both the scheduler fix I talked about, and the first piece of updated documentation. I have other documentation pieces in progress locally and I will be pushing them as I am able to finish them.

I've also been looking into a mechanism that would allow me to add back support for frameworks like Qt (via PySide6) or gevent. Particularly Qt support is something I'm inclined to keep from the 3.x series, given the number of questions and issues raised about it here. But this work is unlikely to land before 4.0.0a1, given how horribly late that release is already.

I've also managed to fix a bunch of common annoyances on the 3.x series. I'll be making a new release of that soon.

agronholm avatar Feb 15 '22 00:02 agronholm

I'd defer those additional features to 4.1. What we have in master will be good enough to start with for new users. Older users will wait till the appropriate time to migrate anyway.

thedrow avatar Feb 15 '22 16:02 thedrow

Any ETAs for V4 release? I appreciate your work 👍

clot27 avatar Mar 04 '22 06:03 clot27

+1 for canceling jobs. Thank you for this great module!

jenfredwell avatar Mar 10 '22 02:03 jenfredwell

Any ETAs for V4 release? I appreciate your work +1

I've given too many ETAs that have all flown by, so I'm not doing that anymore. For v4.0.0a1, the goal is to get, at the minimum, the documentation done and the scheduler working for the most common use cases.

On that note, some practical testing by Donald J. Welch and myself resulted in some interesting findings. I was able to make a sample ASGI middleware that should work with any ASGI framework. As for WSGI, it turns out that the synchronous API is not suitable for use in that environment and must be modified so that you can just start the scheduler without using a context manager (which must be used on the async side due to AnyIO/trio compatibility, as they enforce structured concurrency).
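
For the curious, ASGI middleware of that sort boils down to intercepting the lifespan events; the sketch below is not the actual sample code and merely assumes the scheduler is used as an async context manager:

from contextlib import AsyncExitStack

class SchedulerMiddleware:
    """Hypothetical ASGI middleware: enters the scheduler (an async context manager)
    on the lifespan startup event and exits it on lifespan shutdown."""

    def __init__(self, app, scheduler):
        self.app = app
        self.scheduler = scheduler
        self._stack = AsyncExitStack()

    async def __call__(self, scope, receive, send):
        if scope["type"] != "lifespan":
            await self.app(scope, receive, send)
            return

        async def wrapped_receive():
            message = await receive()
            if message["type"] == "lifespan.startup":
                await self._stack.enter_async_context(self.scheduler)
            elif message["type"] == "lifespan.shutdown":
                await self._stack.aclose()
            return message

        await self.app(scope, wrapped_receive, send)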

I have some time to work on APScheduler this weekend.

agronholm avatar Mar 12 '22 17:03 agronholm

Another week has gone by, and the user guide rewrite is well underway. I now also have some PoC code to make APScheduler integrate with both WSGI and ASGI web apps. Writing the WSGI code revealed some issues with the synchronous API, however. Given that using the asynchronous scheduler with async with is a must (due to its reliance on AnyIO task groups, which are based on Trio), I thought that I could mirror that in the synchronous API too, but it turns out that supporting WSGI that way is impossible. The scheduler has to be started as an import side effect and has to remain running when the module has finished running, making it impossible to mandate its use through a context manager (with Scheduler() as scheduler: ...). As soon as I discovered this problem, I changed the synchronous scheduler API so that it can be run in a background thread, with an atexit handler that will gracefully stop it once the process is about to exit.
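
Stripped of APScheduler specifics, the background-thread-plus-atexit pattern looks like this (a generic illustration, not the scheduler's internals):

import atexit
import threading

class BackgroundRunner:
    """Generic illustration of the start-in-background pattern: run a loop in a daemon
    thread at import time and stop it gracefully when the process exits."""

    def __init__(self):
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while not self._stop.wait(timeout=1.0):
            pass  # a real scheduler would process due schedules here

    def start_in_background(self):
        self._thread.start()
        atexit.register(self.stop)  # ensure a graceful shutdown at interpreter exit

    def stop(self):
        self._stop.set()
        self._thread.join(timeout=5)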

In the process of writing the user guide, I ran into another API design issue. The 3.x design where schedules can be "tentatively" added to the scheduler, and only "really" added to the store when the scheduler is started has been a huge pain point for users, which is why I got rid of that in v4.0. However, I hadn't yet gotten to the point of giving the user an opportunity to update/replace persistent schedules before the scheduler starts running (and potentially processing obsolete schedules). The way it looks now is that the scheduler will have to start the data store (and the event broker) on demand, if the scheduler is accessed before its main loop is started. This is hopefully the last API change I will have to make before v4.0.0a1, and I hope I won't run into major issues modifying the synchronous data store/event broker design to accommodate this change.

In short, here is an example of how one would use the synchronous scheduler in a WSGI app with a PostgreSQL data store:

scheduler = Scheduler(SQLAlchemyDataStore.from_url('postgresql+psycopg2://user:pass@host'))
scheduler.add_schedule(my_function, CronTrigger(hour=0, minute='*/30'), conflict_policy=ConflictPolicy.replace)
scheduler.start_in_background()

Constructive criticism is welcome, as always.

agronholm avatar Mar 21 '22 00:03 agronholm

This is the most elegant scheduler I've ever used, and I've been looking forward to the release of 4.0. Thank you for all the hard work you've put into this. Synchronous and asynchronous support is critical; for instance, I tend to use FastAPI instead of Flask when choosing a framework.

binbjz avatar Mar 21 '22 09:03 binbjz

Hey @agronholm, congratulations and thank you for the awesome work! What's left for the v4 release? Can I use master now "safely" as is? Cheers

p-fernandes avatar May 04 '22 11:05 p-fernandes

Mostly just getting the documentation rewrite finished. I have so many projects that I'm working on that there can be a few weeks of inactivity in any given project because I'm focused on something else.

Some enterprising individuals have already started using master, and reported a number of bugs, all of which I've fixed. Nothing has been reported in a while, so if you know what you're doing, and are expecting some rough edges here and there, go ahead!

agronholm avatar May 05 '22 17:05 agronholm

This library looks great, but I'm working in a flask app and would love that start_in_background feature for the synchronous scheduler. Any chance you could put the work-in-progress up in a branch? Also, maybe making an alpha release, even without docs being updated, would be useful for getting more testing+feedback?

ccope avatar Jun 10 '22 00:06 ccope

I'm again sorry to make everyone wait, but with my summer vacation approaching, it's looking like I will have enough time on my hands to finish the work left for getting the alpha release out before August. The WSGI changes turned out to require much more substantial changes than I had hoped for, but the end result feels like it was worth the effort. The API is cleaner and easier to understand. A downside of this change is that the sync API is no longer completely symmetrical with the async one, but that's something I can live with.

Also, maybe making an alpha release, even without docs being updated, would be useful for getting more testing+feedback?

Anyone adventurous enough to try it out can just install directly from Github. I won't need to make a release for that. Several people have already done this. As for the WSGI changes, the code is not yet in working order but will be Real Soon (tm). I will push that to master once the tests pass.

agronholm avatar Jun 26 '22 09:06 agronholm

Hey, I am looking forward to those changes! This is a "volunteer project" and I think there should be no need for you to apologize. :) I think everybody will appreciate the new API and be thankful for all of the work you've put in :+1:

Anyone adventurous enough to try it out can just install directly from Github. I won't need to make a release for that. Several people have already done this.

I would like to point out a different reason to have alpha releases available: It allows users to publish PyPI packages that depend upon the alpha releases, which is not possible with pointing to the GitHub master branch. At least I am not aware of how this can work (maybe I didn't do enough research?). This is maybe a niche use case, but I like to work with alpha releases during the development and test phase and use those when distributing alpha releases of my projects. I just wanted to add this to give you a potential reason why some people might find alpha releases helpful.

With the upcoming alpha release or not, I am grateful for all of the work you have put into this project. Thank you :pray:

kai-tub avatar Jun 26 '22 10:06 kai-tub

I'm again sorry to make everyone wait, but with my summer vacation approaching, it's looking like I will have enough time on my hands to finish the work left for getting the alpha release out before August. The WSGI changes turned out to require much more substantial changes than I had hoped for, but the end result feels like it was worth the effort. The API is cleaner and easier to understand. A downside of this change is that the sync API is no longer completely symmetrical with the async one, but that's something I can live with.

Hear, hear - many of us lurkers appreciate all you put into this project and your patience and diligence replying to questions and people seeking advice. Thank you for all of your effort that has created a very useful tool for all of us. Enjoy your vacation.

cfaaron avatar Jun 26 '22 13:06 cfaaron

It allows users to publish PyPI packages that depend upon the alpha releases, which is not possible with pointing to the GitHub master branch.

Respectfully, I feel this is not a great idea at this point when the API is still in a flux. Patience, please :)

agronholm avatar Jun 26 '22 21:06 agronholm

My project is waiting for your release

asihacker avatar Jul 07 '22 11:07 asihacker

Can version 4.0 be released before August 2022?

asihacker avatar Jul 07 '22 11:07 asihacker

Certainly not the final version. 4.0a1, probably.

agronholm avatar Jul 07 '22 17:07 agronholm

I just pushed a bunch of commits that enable the WSGI use case, along with the documentation for that feature. Next, I'll be working on updating the documentation for the 4.0a1 release.

agronholm avatar Jul 18 '22 21:07 agronholm

Today I pushed a big batch of commits that updated the examples directory. This took much longer than anticipated (4 days of work) because I kept finding (and fixing) new bugs and API inconsistencies. But things are looking much better now, and it looks like I can finally focus on updating the actual documentation. I still have until the end of the week to deliver the 4.0a1 release :)

agronholm avatar Jul 27 '22 11:07 agronholm

Good news for experimenters and early adopters: Hynek Schlawack graciously just made a new release of attrs which enabled the APScheduler test suite to pass on Python 3.11.

agronholm avatar Jul 28 '22 21:07 agronholm

Good news for experimenters and early adopters: Hynek Schlawack graciously just made a new release of attrs which enabled the APScheduler test suite to pass on Python 3.11.

This is really good news. Hopefully, when Python 3.11 is released, 4.0 will be released as well.

binbjz avatar Jul 29 '22 16:07 binbjz

Good news for experimenters and early adopters: Hynek Schlawack graciously just made a new release of attrs which enabled the APScheduler test suite to pass on Python 3.11.

This is really good news. Hopefully, when Python 3.11 is released, 4.0 will be released as well.

No promises. But the documentation is coming along nicely, and still on track for the end of the month target for the first alpha. After that, I expect to add in the bonus features like cancellation, tags etc. along with fixing bugs encountered by the users. Once all the features are in and the bug hunt calms down, I'll make a beta release. How long this will take is anybody's guess.

agronholm avatar Jul 29 '22 18:07 agronholm

So, it's August already. I didn't quite meet my self imposed deadline, but if you look at the commit history, I haven't exactly rested on my laurels either. The examples were updated, docstrings written, module structure refactored and all parts of the documentation other than the user guide and the changelog are more or less up to date for v4.0.0a1 now. I just need one or two more productive evenings to get this thing to the goal.

agronholm avatar Jul 31 '22 23:07 agronholm

come on.

asihacker avatar Aug 01 '22 06:08 asihacker

come on.

?

agronholm avatar Aug 01 '22 06:08 agronholm

come on.

~@asihacker Since there is absolutely no reason @agronholm couldn't just say "forget this" and walk away, even when feeling frustrated, a minimal amount of politeness is necessary. Especially considering that he is actively working on and making good progress on 4.0.~

taybin avatar Aug 01 '22 14:08 taybin

Well, it seems there is some misunderstanding here. By "come on" he probably means "fighting!", which is "加油" in Chinese.

Jackeriss avatar Aug 02 '22 08:08 Jackeriss

Maybe "fighting" is still not the right translation, it's a little hard to translate "加油"(Add oil), but I think he just wanted to encourage you.

https://www.wikiwand.com/en/Add_oil https://translate.google.com/?hl=zh-CN&sl=zh-CN&tl=en&text=%E5%8A%A0%E6%B2%B9%0A&op=translate

Jackeriss avatar Aug 02 '22 08:08 Jackeriss

@asihacker @Jackeriss In that case, I apologize for the message above. "Add oil" is a pretty cool idiom.

taybin avatar Aug 02 '22 13:08 taybin

It means encouraging you

asihacker avatar Aug 03 '22 06:08 asihacker

I have been following and learning your project

asihacker avatar Aug 03 '22 06:08 asihacker

come on.

"come on" means I'm encouraging you

asihacker avatar Aug 03 '22 06:08 asihacker

come on.

? I have been following and learning your project

"come on" means I'm encouraging you

asihacker avatar Aug 03 '22 06:08 asihacker

I've overcome my writer's block on the user guide and the work is proceeding quite well. Several sections have already been completed, and I'm hopeful that I can get more work done on it tomorrow.

On another note, another prospective user was asking me for assistance experimenting with v4.0, and in the course of lending that assistance, I remembered that right now, job results are always stored in the data store, so a schedule would quickly fill the data store with results that nobody cares about. To fix this, I will have to add a new job parameter that specifies an expiration time for the results. A schedule should create jobs whose results are never saved, but that raises another question: how to deal with exceptions occurring when the jobs are run? Should those be implicitly logged by the worker? What about job completion events? Should those contain the exception object too (I'm concerned about what happens if serialization fails)? I'd be grateful for your opinions and insights.

agronholm avatar Aug 06 '22 22:08 agronholm

Personally I always want a job exception to be logged by the worker by default. That leaves a nice "paper trail" for me to start hunting down why my jobs failed.

I think job completion events can certainly contain the exception as well. I'm not fully sure of the implications of this, but it may be nice to have it available. I've logged job exceptions to the database or done other things with them in a job completion handler (like starting a different job), so it's nice to have the exception object there. If serialization fails, once again, a log message stating that serialization was not successful, while also logging the job exception, seems sufficient.

rogueinkamp avatar Aug 08 '22 02:08 rogueinkamp

In my experience with other schedulers, IMHO I find this paradigm tends to work well:

  • Store all job results in the datastore. This is happening already I believe.

  • Automatically purge all job results after the same expiration period, regardless of outcome, with a relatively short default. This lets users of non-memory job stores "explore" the results for whatever reason, either programmatically or with DBMS tooling.

  • Log job exceptions to STDERR along with the backtrace to support basic debugging.

  • Include the string representation of a job exception in the datastore, but avoid serializing the entire exception object, because whether or not that is a good idea depends on the backend store and what's in the exception.

  • Support more-advanced job exception handling via an exception handler that has access to the exception object. Let users do something more complicated with the result if necessary (e.g. pass the error to Sentry or some other logger). This is possible already but perhaps there could be an easier way to configure the handler and it should be documented.

I don't think I'm familiar enough with the code to contribute to any of the above, but I'm happy to test more and try to help with the documentation.
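
A purge along those lines could be as simple as a periodic delete keyed on an expiration column; here is a SQLAlchemy sketch with assumed table and column names (not APScheduler's actual schema):

from datetime import datetime, timezone
from sqlalchemy import Column, DateTime, LargeBinary, MetaData, String, Table, delete

metadata = MetaData()
# Assumed layout; the real job results table will differ.
job_results = Table(
    "job_results",
    metadata,
    Column("job_id", String, primary_key=True),
    Column("outcome", String),
    Column("result", LargeBinary),
    Column("expires_at", DateTime(timezone=True), index=True),
)

def purge_expired_results(engine) -> int:
    """Delete all results whose expiration time has passed, regardless of outcome."""
    with engine.begin() as conn:
        cursor = conn.execute(
            delete(job_results).where(job_results.c.expires_at < datetime.now(timezone.utc))
        )
        return cursor.rowcount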

courtland avatar Aug 08 '22 14:08 courtland