Investigate `Timetable` usage for client-server separation

Open kaxil opened this issue 6 months ago • 3 comments

Analyze current timetable usage in scheduler vs DAG processor. Decide if timetables belong in core, task-sdk, or separate package (airflow-commons/airflow-protocols).

kaxil avatar Jun 09 '25 19:06 kaxil

Timetable is kind of the reverse of e.g. dag and asset. It allows customisation, but all the logic is only needed in the scheduler, not the worker nor the dag processor. The main implementation should therefore likely belong in core. However, non-core timetables (e.g. EventsTimetable) may belong in the standard provider instead.

uranusjr avatar Jun 13 '25 06:06 uranusjr

> Timetable is kind of the reverse of e.g. dag and asset. It allows customisation, but all the logic is only needed in the scheduler, not the worker nor the dag processor. The main implementation should therefore likely belong in core. However, non-core timetables (e.g. EventsTimetable) may belong in the standard provider instead.

Doesn't the dag processor use timetable to get the next run time though?

kaxil avatar Jun 13 '25 09:06 kaxil

Oh yes, it does, for the first ever run. I forgot about that. This can be changed if we need to though. The scheduler is responsible for calculating all later runs.

uranusjr avatar Jun 17 '25 07:06 uranusjr

If I use a Custom Timetable, does that code currently run in the Scheduler?

kaxil avatar Jun 23 '25 12:06 kaxil

Yes it does.

uranusjr avatar Jun 24 '25 04:06 uranusjr

So that has to be on the server side?

kaxil avatar Jun 24 '25 08:06 kaxil

Are we still targeting this for 3.1.0?

phanikumv avatar Aug 18 '25 07:08 phanikumv

Yes ideally. @uranusjr ?

kaxil avatar Aug 18 '25 11:08 kaxil

Most of the timetable logic needs to be in the scheduler. So we should kind of do the opposite of SerializedBaseOperator.

  1. Existing timetable classes in core are scheduler-only.
  2. Create timetable “stub” classes that only take arguments, without any scheduler logic.
  3. The stubs are serialised, then deserialised in the scheduler into their “real” timetable class counterparts.

Custom subclasses are more problematic. I guess for now (Airflow 3.1) we can just make that one single subclass fulfill both roles, as a stub and as the real timetable. Maybe at some point we need to have an Airflow Extension SDK for this, and other things plugin-wise that you can register into the scheduler.

uranusjr avatar Aug 27 '25 05:08 uranusjr
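
For illustration, the three steps in the comment above could look roughly like the sketch below. It is plain Python and does not use any Airflow API. Every name in it (`IntervalTimetableStub`, `SchedulerIntervalTimetable`, `SCHEDULER_TIMETABLES`, `deserialize_in_scheduler`) is hypothetical, and the dict-based serialised form is an assumption, not Airflow's actual format.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timedelta, timezone


# Hypothetical authoring-side stub: it only records arguments and
# carries no scheduling logic (step 2).
@dataclass(frozen=True)
class IntervalTimetableStub:
    interval_seconds: int

    def serialize(self) -> dict:
        # Assumed wire format: a type key plus the constructor arguments.
        return {"type": "interval", "args": asdict(self)}


# Hypothetical scheduler-only class that holds the real logic (step 1).
class SchedulerIntervalTimetable:
    def __init__(self, interval_seconds: int) -> None:
        self.interval = timedelta(seconds=interval_seconds)

    def next_run_after(self, last_run: datetime) -> datetime:
        return last_run + self.interval


# Hypothetical registry mapping a serialised type key to its
# scheduler-side counterpart.
SCHEDULER_TIMETABLES = {"interval": SchedulerIntervalTimetable}


def deserialize_in_scheduler(data: dict) -> SchedulerIntervalTimetable:
    # Step 3: the scheduler turns the serialised stub into the real class.
    cls = SCHEDULER_TIMETABLES[data["type"]]
    return cls(**data["args"])


stub = IntervalTimetableStub(interval_seconds=3600)
payload = stub.serialize()
real = deserialize_in_scheduler(payload)
next_run = real.next_run_after(datetime(2025, 8, 27, tzinfo=timezone.utc))
```

In this sketch the stub never needs the scheduler class to be importable, which is the point of the split. A custom subclass that plays both roles, as suggested for Airflow 3.1, would collapse the stub and the real class into one.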