
[3.0.1] Distributed DAG Processing - Worker PATCH to /api/v2/task-instances/{id}/run fails with 405 Method Not Allowed

Open Pandelo0398 opened this issue 5 months ago • 5 comments

Apache Airflow version

3.0.1

If "Other Airflow 2 version" selected, which one?

No response

What happened?

Apache Airflow version: 3.0.1

Executor: CeleryExecutor (distributed workers, multi-host setup)

Deployment method: Docker Compose (official apache/airflow:3.0.1 images)

Summary: When running a distributed Airflow 3.0.1 setup with CeleryExecutor and remote workers, my worker(s) consistently fail with a 405 Method Not Allowed error. The worker tries to PATCH to /api/v2/task-instances/{id}/run, but this endpoint is not implemented/exposed in the REST API, causing all DAG tasks to immediately fail.

Repro steps:

Deploy Airflow 3.0.1 central node (webserver, scheduler, API server) and remote worker nodes via Docker Compose. All use the official images, no custom pip install.

Trigger a DAG run.

Worker logs show:

```
ServerResponseError: Method Not Allowed
airflow.sdk.api.client.Client.patch("task-instances/{id}/run", ...)
```

Detailed logs and stacktrace:

```
/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/api/client.py:146 in start
    resp = self.client.patch(f"task-instances/{id}/run", content=body.model_dump_json())
...
ServerResponseError: Method Not Allowed
```

What you think should happen instead?

Remote worker should be able to communicate and update the state via the API without 405 errors.

DAGs should run to completion.

What happened instead?

Every task execution fails with the above 405 error.

The worker tries to PATCH an endpoint that does not exist, so the request fails.

Images used: apache/airflow:3.0.1 (both central node and worker)

Steps already tried:

Checked for custom pip installs (none).

All images use the official version, verified with docker image ls.

Restarted from scratch, wiped all volumes.

Searched the docs and API spec: PATCH /api/v2/task-instances/{id}/run does not exist.

Possible cause (hypothesis):

Internal SDK/API client in the worker references endpoints not exposed in the public API, or the distributed DAG processing implementation is incomplete for this executor setup.

There might be a version mismatch or a bug in the distributed execution support.

Is this a regression? Unknown. Did not test with Airflow <3.0.1 distributed DAG processing.

How to reproduce

Environment setup

Prepare a distributed Airflow 3.0.1 environment using the official Docker images.

Use CeleryExecutor.

Set up Redis as the Celery broker.

Set up PostgreSQL as the metadata database.

The environment must have:

A central node (running webserver, scheduler, and API server).

At least one remote worker node (only running Celery worker).

Use Docker Compose (attach your docker-compose.yml if possible).

Start the services in this order:

docker-compose up -d postgres redis

Wait for DB and Redis to be healthy.

docker-compose up airflow-init

docker-compose up -d airflow-api-server airflow-scheduler

Start the Celery worker on a separate node/container, using the same DB and Redis.
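
For reference, a rough sketch of the worker-node settings this setup relies on (service names, credentials, and the execution URL below are placeholders matching the compose service names; execution_api_server_url is the setting discussed in the comments below):

```
# airflow.cfg on the remote worker node -- a sketch with placeholder values
[database]
# same metadata DB as the central node
sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@postgres/airflow

[celery]
# same Redis broker as the central node
broker_url = redis://:@redis:6379/0
result_backend = db+postgresql://airflow:airflow@postgres/airflow

[core]
# where the worker-side task SDK sends task-instance updates;
# see the discussion of this value in the comments below
execution_api_server_url = http://airflow-apiserver:8080/execution/
```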

Operating System

Ubuntu 22

Versions of Apache Airflow Providers

Airflow 3.0.1

Deployment

Docker-Compose

Deployment details

No response

Anything else?

No response

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

Pandelo0398 avatar May 30 '25 09:05 Pandelo0398

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

boring-cyborg[bot] avatar May 30 '25 09:05 boring-cyborg[bot]

I am facing the same issue as described above. The uvicorn logs show the following error:

"PATCH /{base_url}/task-instances/01972a55-7d65-7d38-90c4-e29aaf940fae/run HTTP/1.1" 405 Method Not Allowed

Here is the stack dump of the api-server where 405 is thrown:

  File "<string>", line 1, in <module>
  File "/usr/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/usr/lib/python3.10/multiprocessing/spawn.py", line 129, in _main
    return self._bootstrap(parent_sentinel)
  File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/_subprocess.py", line 80, in subprocess_started
    target(sockets=sockets)
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/supervisors/multiprocess.py", line 63, in target
    return self.real_target(sockets)
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/server.py", line 66, in run
    return asyncio.run(self.serve(sockets=sockets))
  File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/base.py", line 141, in coro
    await self.app(scope, receive_or_disconnect, send_no_error)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 720, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 751, in app
    await partial.handle(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 285, in handle

Furthermore, in routing.py:handle, self.methods is {'GET'} and scope["method"] is PATCH

Tried to see how the route /task-instances/{id}/run is set up in task_instances.py and it seems to be correct. Appreciate your help with this.

boudey avatar Jun 01 '25 07:06 boudey

Hi @Pandelo0398, I think the issue is that execution_api_server_url is not configured correctly. If you override it:

  1. It must end with execution
  2. It must not include api/v2

So a good execution_api_server_url is http://my-remote-server/my-base/execution, assuming of course that my-base is part of the api-server base URL. A bad one is http://my-remote-server/my-base or http://my-remote-server/api/v2.
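
To make that concrete, here is a minimal airflow.cfg sketch following the rules above (my-remote-server and my-base are the placeholder host and base path from the example, not real values):

```
[core]
# Good: ends with "execution" and does not include "api/v2".
# "my-base" stands for whatever base path the api-server is served under.
execution_api_server_url = http://my-remote-server/my-base/execution

# Bad values, per the rules above:
#   execution_api_server_url = http://my-remote-server/my-base
#   execution_api_server_url = http://my-remote-server/api/v2
```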

boudey avatar Jun 01 '25 15:06 boudey

We are experiencing the same issue. We're following the Docker setup in this official guide (https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html), with CeleryExecutor. We have added the new config field [core].execution_api_server_url and set the value to http://airflow-apiserver:8080/execution/. I can verify that the address is correct, but the PATCH method is not allowed.

Logs from the API-server:

[screenshot of API-server log output]

Currently on Airflow 3.0.2

larsstromholm avatar Jun 16 '25 06:06 larsstromholm

Facing the same issue here. There are enough details here already, so I'm not going to add any more logs. Would love to have this fixed ASAP, please.

bhoang avatar Jun 17 '25 17:06 bhoang

Same issue with the Kubernetes executor!

It works after updating the execution_api_server_url to http://airflow-api-server:8080/execution/ instead of http://airflow-api-server:8080

yahiaqous avatar Jun 19 '25 13:06 yahiaqous

I had the same issue on 3.0.2. It started working properly after I changed the execution_api_server_url from http://airflow-apiserver:8080/execution/ to http://airflow-apiserver:8080/airflow/execution/
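
Presumably that is the sub-path rule from the earlier comment: if the api-server is served under an /airflow base path, that path has to appear in the execution URL too. A sketch (the base_url value is my assumption, not taken from the comment above):

```
[api]
# assumed: api-server exposed under an /airflow sub-path
base_url = http://airflow-apiserver:8080/airflow

[core]
# the sub-path must be included here as well
execution_api_server_url = http://airflow-apiserver:8080/airflow/execution/
```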

daijoa avatar Jul 02 '25 06:07 daijoa

If it's open, I'd like to work on it.

mandeepzemo avatar Aug 05 '25 11:08 mandeepzemo

Hey, I'm commenting here in case anyone has the same problems that I had. I did an upgrade from 2.10 to 3.0.4: I recreated the Python virtual environment but just migrated my Postgres DB, and I rebuilt the configuration file starting from the new default, keeping most settings at their defaults. One thing to note: if you set a custom port in the [api] section, it's important to also set base_url with the new port. For example:

# The port on which to run the api server
#
# Variable: AIRFLOW__API__PORT
#
# port = 8080
port = 8081

Then you must also set:

# The base url of the API server. Airflow cannot guess what domain or CNAME you are using.
# If the Airflow console (the front-end) and the API server are on a different domain, this config
# should contain the API server endpoint.
#
# Example: base_url = https://my-airflow.company.com
#
# Variable: AIRFLOW__API__BASE_URL
#
base_url = http://localhost:8081/

And leave execution_api_server_url at its default.

When I set execution_api_server_url = http://localhost:8081/execution or execution_api_server_url = http://localhost:8081/airflow/execution without setting base_url, I got airflow.sdk.api.client.ServerResponseError: Method Not Allowed for every task, and each task failed immediately with no logs shown. If I only set the port, I got httpcore.RemoteProtocolError: Server disconnected without sending a response. with the same behavior (tasks failing immediately with no logs).
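
Putting that together, the combination that worked for me looks roughly like this (localhost and port 8081 as in the snippets above):

```
[api]
# custom api-server port
port = 8081
# base_url must reflect the custom port
base_url = http://localhost:8081/

# [core] execution_api_server_url is intentionally left at its default (not set)
```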

ddar0ch avatar Aug 16 '25 20:08 ddar0ch

@Pandelo0398 can you try https://github.com/apache/airflow/issues/51235#issuecomment-3193891815

vatsrahul1001 avatar Sep 02 '25 05:09 vatsrahul1001

This issue has been automatically marked as stale because it has been open for 14 days with no response from the author. It will be closed in next 7 days if no further activity occurs from the issue author.

github-actions[bot] avatar Sep 29 '25 00:09 github-actions[bot]

This issue has been closed because it has not received response from the issue author.

github-actions[bot] avatar Oct 08 '25 00:10 github-actions[bot]