apisix icon indicating copy to clipboard operation
apisix copied to clipboard

feat: Add dependency protocol checking and deletion checking for stream routing

Open yixianOu opened this issue 1 month ago • 2 comments

Description

This PR implements dependency protocol checking and deletion protection for stream routes, resolving #6939.

Previously, it was possible to create a subordinate stream route referencing a non-existent superior_id or a superior route with a mismatched protocol. Additionally, superior routes could be deleted even while being referenced by subordinate routes, leading to potential runtime errors.

Checklist

  • [x] I have read the CONTRIBUTING guidelines.
  • [x] I have added tests to cover my changes.
  • [x] I have verified that the changes work as expected.

Changes

  1. Dependency Checking: Added validation in check_conf to ensure that when a superior_id is provided:
    • The referenced superior route exists.
    • The protocol of the subordinate route matches the superior route.
  2. Deletion Protection: Implemented a delete_checker to prevent the deletion of a stream route if it is currently referenced as a superior_id by other routes.
  3. Tests: Added a new test file t/admin/stream-routes-subordinate.t covering various scenarios including creation, protocol mismatch, and deletion protection.

Related Issue

Resolves #6939

Sign-off

Signed-off-by: Orician [email protected]

yixianOu avatar Dec 06 '25 13:12 yixianOu

Modified t/xrpc/pingpong.t to expect a 400 error code when creating the route with superior_id = 10000. This aligns the test with the new validation logic. Added cleanup test to avoid residual configuration in etcd affecting subsequent redis tests. For mcp-bridge testing, I don't know how to fix it.

yixianOu avatar Dec 08 '25 14:12 yixianOu

httpbin.org:80 Is this external service reliable? My local development container performs tests, this service sometimes times out, sometimes returns 503, sometimes it can be accessed normally and passes the test, I don't know if this is my local network problem or this external service problem.

yixianOu avatar Dec 09 '25 13:12 yixianOu

  1. jwe-decrypt.t
  • Change: Updated TEST 26 assertion to match upstream Authorization header when echoed as an array, i.e., allowing "Authorization": ["hello"].
  • Reason: httpbin’s /headers returns headers as arrays; the previous string-only regex caused false test failures. This makes the test compatible with the actual upstream response format.
  1. stream-routes-subordinate.t
  • Change: Switched failure checks to decode the admin API JSON and assert on the error_msg field (e.g., “failed to fetch stream routes[999]”, “protocol mismatch”, “can not delete this stream route”).
  • Reason: Using structured error_msg ensures precise, stable validation, avoiding brittle raw-string matching and aligning tests with the API’s canonical error payload format.
  1. ai-request-rewrite2.t
  • Change: removed manual cleanup steps embedded in the tests.
  • Reason: Reduce flakiness and dependency on manual teardown.

CI test failure occurred in the job: build (ubuntu-latest, linux_apisix_current_luarocks_in_customed_nginx)

Error log details:

Error: 2025/12/12 06:10:14 [error] 95223#95223: *1 [lua] config_etcd.lua:602: load_full_data(): failed to check item data of [/apisix/routes] err:object matches none of the required: ["plugins","uri"] or ["upstream","uri"] or ["upstream_id","uri"] or ["service_id","uri"] or ["plugins","uris"] or ["upstream","uris"] or ["upstream_id","uris"] or ["service_id","uris"] or ["script","uri"] or ["script","uris"] ,val: {"priority":0,"status":1,"uri":"/2"}, context: init_worker_by_lua*

Root cause analysis: The failure is triggered by an invalid route object persisted in etcd: {"priority":0,"status":1,"uri":"/2"}. This route object fails schema validation because it lacks mandatory field combinations — per the routing schema requirements, at least one of [plugins, upstream, upstream_id, service_id, script] must be paired with either uri or uris. The schema check failure in load_full_data() throws the "object matches none of the required" error, causing the worker initialization process to terminate immediately and resulting in CI pipeline failure.

The CLI test script t/cli/test_etcd_sync_event_handle.sh is causing CI failures in the linux_apisix_current_luarocks_in_customed_nginx job due to incompatibility with current APISIX schema validation behavior during worker initialization.

issue: #12815

yixianOu avatar Dec 13 '25 10:12 yixianOu

During the last CI test run, I noticed that the jwe-decrypt.t test failed. The reason was that the Authorization field in the response headers had a value of "Authorization": ["hello"].

#   Failed test 't/plugin/jwe-decrypt.t TEST 26:  verify in upstream header - response_body_like - response is expected ({
#   "headers": { "Authorization": [ "hello" ], "Host": [ "localhost"
#   ], "X-Forwarded-For": [ "127.0.0.1" ], "X-Forwarded-Host": [
#   "localhost" ], "X-Forwarded-Port": [ "1984" ],
#   "X-Forwarded-Proto": [ "http" ], "X-Real-Ip": [ "127.0.0.1" ] }
#   })'
#   at /usr/local/share/perl/5.38.2/Test/Nginx/Socket.pm line 1706.
#                   '{
#   "headers": {
#     "Authorization": [
#       "hello"
#     ],
#     "Host": [
#       "localhost"
#     ],
#     "X-Forwarded-For": [
#       "127.0.0.1"
#     ],
#     "X-Forwarded-Host": [
#       "localhost"
#     ],
#     "X-Forwarded-Port": [
#       "1984"
#     ],
#     "X-Forwarded-Proto": [
#       "http"
#     ],
#     "X-Real-Ip": [
#       "127.0.0.1"
#     ]
#   }
# }
# '
#     doesn't match '(?^s:.*"Authorization": "hello".*
# )'
# Looks like you failed 2 tests of 154.
# [08:26:06] t/plugin/jwe-decrypt.t ..
# Dubious, test returned 2 (wstat 512, 0x200)
# Failed 2/154 subtests

Therefore, I modified the expected response in jwe-decrypt.t to .*"Authorization":\s*\[\s*"hello"\s*\].*.
This change allowed the test to pass successfully on my local machine:

╭───────────────────────────────────────────────────────────────────────────────────╮
│ [08:16:40] t/plugin/jwe-decrypt.t .. ok    12507 ms ( 0.03 usr  0.00 sys +  1.91  │
│ cusr  0.63 csys =  2.57 CPU)                                                      │
│ [08:16:40]                                                                        │
│ All tests successful.                                                             │
│ Files=1, Tests=154, 13 wallclock secs ( 0.06 usr  0.01 sys +  1.91 cusr  0.63     │
│ csys =  2.61 CPU)                                                                 │
│ Result: PASS                                                                      │
│ <exited with exit code 0>                                                         │
╰───────────────────────────────────────────────────────────────────────────────────╯

However, in today's CI test run, jwe-decrypt.t failed again. This time, the Authorization field in the response headers was "Authorization": "hello".

Should we revert the expected response back to .*"Authorization": "hello".*? @Baoyuantop

yixianOu avatar Dec 15 '25 08:12 yixianOu

Hi @yixianOu, without any logical changes, we cannot make tests pass by modifying the test code. All test-related modifications in a PR must revolve around the functionality of the PR itself; irrelevant modifications should not be included.

I addressed a portion of the issues in CI at https://github.com/apache/apisix/pull/12805. Have you merged that part of the code?

Baoyuantop avatar Dec 15 '25 08:12 Baoyuantop

https://github.com/apache/apisix/actions/runs/20190600057/job/58037592121?pr=12794

  • CI log breakdown: 1st run: TEST 20 failed, 23 passed Retry run: All 23 passed but process exited with code 1

  • Could this be a flaky test issue? Likely culprits:

    • etcd connection timeout (saw "connection refused" in logs)
    • Race conditions from network latency
    • Test env init timing problems
  • The files I modified: apisix/admin/stream_routes.lua, t/admin/stream-routes-subordinate.t, t/xrpc/pingpong.t None of these touch Kubernetes discovery—so wouldn’t the K8s test failure be unrelated to my changes?

yixianOu avatar Dec 15 '25 08:12 yixianOu