
Parallel dispatch calls freeze when Lua resolver yields

Open · hishamhm opened this issue 1 year ago · 7 comments

Problem summary: If you dispatch two parallel HTTP calls, and both call into Lua for DNS resolution, and both yield to wait for the DNS response, only the second one gets a response. The first one waits forever.

Background and some observations of the behavior

This has been observed first with Kong Gateway, but it is also reproducible in this module's test suite (see test cases in PR #523).

Using two different hostnames allows us to see in the logs that the second DNS request is the one that gets a response.

Yielding affects the behavior:

  • if you issue a request that triggers two parallel calls for the same hostname, it hangs: the first DNS request yields and never resumes. If you Ctrl-C the request and try again, the overall request succeeds, because the hostname was cached by the successful call of the previous request, so both calls now succeed.
  • if you issue a request that triggers two parallel calls for different hostnames, the first request behaves the same way (it hangs because the first DNS request yields and never resumes). A second request then hangs as well: it tries the first hostname, which never resolved, and yields; it then tries the second hostname, which is cached, and succeeds; but the overall request is still waiting on the first hostname.

In short, it seems that triggering the Lua resolver a second time discards the pending DNS resolution coroutine.

Logs

Enabling "debug_mail" logs on the Kong Gateway showed some interesting bits which may provide some clues. Here's what I've observed:

First we create a new Lua thread to resolve gobolinux.org, all is normal so far:

2024/03/22 16:25:45 [debug] 90985#0: *2 lua creating new thread
2024/03/22 16:25:45 [debug] 90985#0: lua ref lua thread 00005A6A986C62C0 (ref 649)
2024/03/22 16:25:45 [debug] 90985#0: *2 code cache lookup (key='wasm_lua_resolver_chunkdd7007ea3c1caf68390921fc5ddb939c', ref=0)
2024/03/22 16:25:45 [debug] 90985#0: *2 code cache miss (key='wasm_lua_resolver_chunkdd7007ea3c1caf68390921fc5ddb939c', ref=0)
2024/03/22 16:25:45 [debug] 90985#0: *2 wasm running lua thread (lctx: 00005A6A96928E80, L: 00005A6A95FFCD30, co: 00005A6A986C62C0)
2024/03/22 16:25:45 [debug] 90985#0: *2 lua reset ctx
2024/03/22 16:25:45 [debug] 90985#0: *2 http lua finalize threads
2024/03/22 16:25:45 [debug] 90985#0: *2 lua run thread, top:0 c:1
2024/03/22 16:25:45 [debug] 90985#0: *2 [lua] [string "wasm_lua_resolver_chunk"]:13: wasm lua resolver thread
2024/03/22 16:25:45 [debug] 90985#0: *2 [lua] [string "wasm_lua_resolver_chunk"]:20: wasm lua resolver using existing dns_client
2024/03/22 16:25:45 [debug] 90985#0: *2 [lua] [string "wasm_lua_resolver_chunk"]:42: wasm lua resolving "gobolinux.org"

This causes the DNS client to trigger the UDP message for DNS resolution, epoll fd 34. It does not get an immediate response, so it yields:

2024/03/22 16:25:45 [debug] 90985#0: *2 lua udp socket network address given directly
2024/03/22 16:25:45 [debug] 90985#0: *2 lua udp socket resolve retval handler
2024/03/22 16:25:45 [debug] 90985#0: *2 UDP socket 34
2024/03/22 16:25:45 [debug] 90985#0: *2 connect to 127.0.0.53:53, fd:34 #646
2024/03/22 16:25:45 [debug] 90985#0: *2 epoll add event: fd:34 op:1 ev:80002001
2024/03/22 16:25:45 [debug] 90985#0: *2 add cleanup: 00005A6A964C8870
2024/03/22 16:25:45 [debug] 90985#0: *2 lua udp socket connect: 0
2024/03/22 16:25:45 [debug] 90985#0: *2 [lua] client.lua:737: lookup(): WILL QUERY gobolinux.org
2024/03/22 16:25:45 [debug] 90985#0: *2 send: fd:34 31 of 31
2024/03/22 16:25:45 [debug] 90985#0: *2 lua udp socket calling receive() method
2024/03/22 16:25:45 [debug] 90985#0: *2 lua udp socket read timeout: 2000
2024/03/22 16:25:45 [debug] 90985#0: *2 lua udp socket receive buffer size: 4096
2024/03/22 16:25:45 [debug] 90985#0: *2 lua udp socket read data: waiting: 0
2024/03/22 16:25:45 [debug] 90985#0: *2 recv: fd:34 -1 of 4096
2024/03/22 16:25:45 [debug] 90985#0: *2 recv() not ready (11: Resource temporarily unavailable)
2024/03/22 16:25:45 [debug] 90985#0: *2 lua udp recv returned -2
2024/03/22 16:25:45 [debug] 90985#0: *2 event timer add: 34: 2000:10471706
2024/03/22 16:25:45 [debug] 90985#0: *2 lua resume returned 1
2024/03/22 16:25:45 [debug] 90985#0: *2 lua thread yielded

(we then get the following, which I don't know whether it's related at all)

2024/03/22 16:25:45 [debug] 90985#0: posted event 00005A6A9868DF30
2024/03/22 16:25:45 [debug] 90985#0: *2 delete posted event 00005A6A9868DF30

What follows immediately is the next dispatch, which needs to resolve example.com:

2024/03/22 16:25:45 [debug] 90985#0: *2 proxy_wasm http dispatch connecting...
2024/03/22 16:25:45 [debug] 90985#0: *2 wasm tcp socket resolving: example.com
2024/03/22 16:25:45 [debug] 90985#0: *2 wasm tcp socket using default resolver
2024/03/22 16:25:45 [debug] 90985#0: *2 wasm tcp socket resolving...
2024/03/22 16:25:45 [debug] 90985#0: *2 lua creating new thread
2024/03/22 16:25:45 [debug] 90985#0: lua ref lua thread 00005A6A9678DC20 (ref 650)
2024/03/22 16:25:45 [debug] 90985#0: *2 code cache lookup (key='wasm_lua_resolver_chunkdd7007ea3c1caf68390921fc5ddb939c', ref=0)
2024/03/22 16:25:45 [debug] 90985#0: *2 code cache miss (key='wasm_lua_resolver_chunkdd7007ea3c1caf68390921fc5ddb939c', ref=0)
2024/03/22 16:25:45 [debug] 90985#0: *2 wasm running lua thread (lctx: 00005A6A97EE2D90, L: 00005A6A95FFCD30, co: 00005A6A9678DC20)

But unlike the dispatch above, where the three lines lua reset ctx, http lua finalize threads and lua run thread appeared next to each other, we now get this instead:

2024/03/22 16:25:45 [debug] 90985#0: *2 lua reset ctx
2024/03/22 16:25:45 [debug] 90985#0: *2 http lua finalize threads
2024/03/22 16:25:45 [debug] 90985#0: *2 lua finalize socket
2024/03/22 16:25:45 [debug] 90985#0: *2 lua close socket connection
2024/03/22 16:25:45 [debug] 90985#0: *2 event timer del: 34: 10471706
2024/03/22 16:25:45 [debug] 90985#0: *2 reusable connection: 0
2024/03/22 16:25:45 [debug] 90985#0: *2 lua deleting light thread 00005A6A986C62C0 (ref 649)
2024/03/22 16:25:45 [debug] 90985#0: *2 lua caching unused lua thread 00005A6A986C62C0 (ref 649)
2024/03/22 16:25:45 [debug] 90985#0: *2 lua run thread, top:0 c:1

It looks like it deleted the coroutine ending in -62C0, which was the one from the gobolinux.org DNS resolution above!?

The query for example.com then proceeds in a similar fashion to the previous one... (even on the same fd:34)

2024/03/22 16:25:45 [debug] 90985#0: *2 [lua] [string "wasm_lua_resolver_chunk"]:13: wasm lua resolver thread
2024/03/22 16:25:45 [debug] 90985#0: *2 [lua] [string "wasm_lua_resolver_chunk"]:20: wasm lua resolver using existing dns_client
2024/03/22 16:25:45 [debug] 90985#0: *2 [lua] [string "wasm_lua_resolver_chunk"]:42: wasm lua resolving "example.com"
2024/03/22 16:25:45 [debug] 90985#0: *2 [lua] [string "wasm_lua_resolver_chunk"]:44: has individual_toip: function: 0x5a6a962ef8f0
2024/03/22 16:25:45 [debug] 90985#0: *2 lua udp socket network address given directly
2024/03/22 16:25:45 [debug] 90985#0: *2 lua udp socket resolve retval handler
2024/03/22 16:25:45 [debug] 90985#0: *2 UDP socket 34
2024/03/22 16:25:45 [debug] 90985#0: *2 connect to 127.0.0.53:53, fd:34 #647
2024/03/22 16:25:45 [debug] 90985#0: *2 epoll add event: fd:34 op:1 ev:80002001
2024/03/22 16:25:45 [debug] 90985#0: *2 add cleanup: 00005A6A96281D30
2024/03/22 16:25:45 [debug] 90985#0: *2 lua udp socket connect: 0
2024/03/22 16:25:45 [debug] 90985#0: *2 [lua] client.lua:737: lookup(): WILL QUERY example.com
2024/03/22 16:25:45 [debug] 90985#0: *2 send: fd:34 29 of 29

...but then after a while we do get data coming from fd:34 and we get a query result for example.com.

2024/03/22 16:25:45 [debug] 90985#0: epoll timer: 837
2024/03/22 16:25:45 [debug] 90985#0: epoll: fd:34 ev:0001 d:00005A6A9888E191
2024/03/22 16:25:45 [debug] 90985#0: *2 lua udp socket handler for "/anything?", wev 0
2024/03/22 16:25:45 [debug] 90985#0: *2 lua udp socket read handler
2024/03/22 16:25:45 [debug] 90985#0: *2 event timer del: 34: 10471706
2024/03/22 16:25:45 [debug] 90985#0: *2 lua udp socket read data: waiting: 1
2024/03/22 16:25:45 [debug] 90985#0: *2 recv: fd:34 85 of 4096
2024/03/22 16:25:45 [debug] 90985#0: *2 lua udp recv returned 85
2024/03/22 16:25:45 [debug] 90985#0: *2 lua udp socket waking up the current request
2024/03/22 16:25:45 [debug] 90985#0: *2 wasm rctx reused: 00005A6A964161C0 (r: 00005A6A97B8D430, main: 1)
2024/03/22 16:25:45 [debug] 90985#0: *2 wasm wev handler "/anything?" - timeout: 0, ready: 1 (main: 1, count: 1, resp_finalized: 0, state: 2)
2024/03/22 16:25:45 [debug] 90985#0: *2 wasm resuming lua thread (lctx: 00005A6A97EE2D90, L: 00005A6A95FFCD30, co: 00005A6A9678DC20)
2024/03/22 16:25:45 [debug] 90985#0: *2 lua udp operation done, resuming lua thread
2024/03/22 16:25:45 [debug] 90985#0: *2 lua udp socket calling prepare retvals handler 00005A6A943F1684, u:00005A6A98CEEED0
2024/03/22 16:25:45 [debug] 90985#0: *2 lua udp socket receive return value handler
2024/03/22 16:25:45 [debug] 90985#0: *2 lua run thread, top:0 c:1
2024/03/22 16:25:45 [debug] 90985#0: *2 [lua] client.lua:739: lookup(): QUERY GOT RESULT

hishamhm avatar Mar 26 '24 15:03 hishamhm

Interesting: running #523 on CI with Valgrind produced the following after the test cases timed out:

https://github.com/Kong/ngx_wasm_module/actions/runs/8438836263/job/23111996715

==46021== 556 (48 direct, 508 indirect) bytes in 1 blocks are definitely lost in loss record 146 of 173
==46021==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==46021==    by 0x1E464A: ngx_alloc (ngx_alloc.c:22)
==46021==    by 0x1ACD49: ngx_create_pool (ngx_palloc.c:21)
==46021==    by 0x3ACBA9: ngx_wasm_lua_thread_new (ngx_wasm_lua.c:128)
==46021==    by 0x3AE14A: ngx_wasm_lua_resolver_resolve (ngx_wasm_lua_resolver.c:145)
==46021==    by 0x391B02: ngx_wasm_socket_tcp_connect (ngx_wasm_socket_tcp.c:406)
==46021==    by 0x3B7A0F: ngx_http_proxy_wasm_dispatch_resume_handler (ngx_http_proxy_wasm_dispatch.c:734)
==46021==    by 0x3B8628: ngx_http_proxy_wasm_dispatch_handler (ngx_http_proxy_wasm_dispatch.c:467)
==46021==    by 0x1DEB89: ngx_event_process_posted (ngx_event_posted.c:35)
==46021==    by 0x1DE086: ngx_process_events_and_timers (ngx_event.c:273)
==46021==    by 0x1ED637: ngx_single_process_cycle (ngx_process_cycle.c:323)
==46021==    by 0x1A8038: main (nginx.c:384)
==46021== 
{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   match-leak-kinds: definite
   fun:malloc
   fun:ngx_alloc
   fun:ngx_create_pool
   fun:ngx_wasm_lua_thread_new
   fun:ngx_wasm_lua_resolver_resolve
   fun:ngx_wasm_socket_tcp_connect
   fun:ngx_http_proxy_wasm_dispatch_resume_handler
   fun:ngx_http_proxy_wasm_dispatch_handler
   fun:ngx_event_process_posted
   fun:ngx_process_events_and_timers
   fun:ngx_single_process_cycle
   fun:main
}

...so it seems that an lctx is indeed being lost along the way :thinking:

hishamhm avatar Mar 26 '24 16:03 hishamhm

Thanks for the test case! Long story short: yes, as shown by the Valgrind report, it appears our Lua bridge cannot handle more than one "Lua thread" (lctx) at a time.

At first I enabled some of the DDEBUG macros and ran the test as-is. I annotated the error.log output here, which shows what is currently going on. "posted event" followed by "delete posted event" is not of concern, since it's part of ngx_event_process_posted. The logs showed the issue was with a call to ngx_http_lua_reset_ctx in the Lua bridge.

I then commented out the ngx_http_lua_reset_ctx call, which prevented the waiting file descriptor of call 1 from being prematurely closed, but it also showed that call 1 was never resumed, even after the Lua thread of call 2 had yielded and resumed all the way through. It then became apparent that rctx only supports a single lctx object, which is problematic since the http module resumes yielded Lua threads via that single lctx reference.

Maybe we should make lctx a single, long-lived Lua thread that schedules smaller coroutines, which I hope we can do with the exposed OpenResty facilities. We could also make rctx store a queue of Lua threads instead of a single one, and resume all threads until it is empty. This may well be the underlying cause of the behavior we've seen with the Gateway's DNS semaphore not being released, although a single Lua thread is supposed to be resumed all the way through, so I am not sure.

thibaultcha avatar Mar 27 '24 01:03 thibaultcha

Thanks for taking a look at it!

I commented the error.log output here, which shows what is currently going on.

Good, so I think I was looking at the right thing: the "# fd:12 closed (!!)" comment corresponds to the same block as "It looks like it deleted the coroutine ending in -62C0" in my annotation above.

We could also make rctx store a queue of Lua threads instead of a single one, and resume all threads until its empty.

Something along those lines was the first thing that came to my mind when I saw the Valgrind output. I'll give that a try!

hishamhm avatar Mar 27 '24 14:03 hishamhm

We could also make rctx store a queue of Lua threads instead of a single one, and resume all threads until its empty.

Something along those lines was the first thing that came to my mind when I saw the Valgrind output. I'll give that a try!

I did try this approach: https://github.com/Kong/ngx_wasm_module/tree/wip/lctxs-queue

I managed to collect all lctxs and resume them, but it doesn't quite work, because the socket data did not arrive correctly.

What is happening is:

  • when our Lua chunk issues a DNS request, resty/dns/resolver.lua yields at line 871 (buf, err = sock:receive(4096))
  • when the response from the DNS server arrives, this triggers the ngx_http_wasm_wev_handler, in which I go through my queue, waking up the yielded lctxs.
  • what happens then is that sock:receive() returns 0 bytes (which then confuses resty.dns.resolver and produces a "truncated" response error)

Apparently, the socket data connection is only plugged correctly to the most-recently-created lctx? I don't fully understand how data is supposed to flow there.

Maybe we should make lctx a single, long-lived Lua thread that schedules smaller co-routines?

If there is no way to map incoming data to the correct lctx at the "wev_handler", then that would be the way to go. But then, OpenResty does have its own socket scheduler, right? (i.e. it implements the equivalents of both luasocket and copas). Not sure how we would be able to put another scheduling layer on top of that. Lua coroutine schedulers are not composable out-of-the-box...

hishamhm avatar Mar 27 '24 18:03 hishamhm

At first I noticed that in ngx_http_lua_socket_udp, the response length is stored in u->received, which is set to 111 after ngx_udp_recv, but later in ngx_http_lua_socket_udp_receive_retval_handler it ends up being 0 (which bubbles up to sock:receive() returning ""). The value of u->received changed because u itself changed in between the recv event and the udp_receive_retval_handler:

  2024/03/27 22:25:20 [debug] 1695661#0: *1 lua udp socket read handler
  2024/03/27 22:25:20 [debug] 1695661#0: *1 event timer del: 12: 155591122
  2024/03/27 22:25:20 [debug] 1695661#0: *1 lua udp socket read data: waiting: 1
  2024/03/27 22:25:20 [debug] 1695661#0: *1 recv: fd:12 111 of 4096
  2024/03/27 22:25:20 [debug] 1695661#0: *1 lua udp recv returned 111
1. ---> u is correct lua *** ngx_http_lua_socket_udp_read: u->received: 111 (u: 0x7f5a8569d9c0) at ../ngx_lua-0.10.26/src/ngx_http_lua_socket_udp.c line 1158.
  2024/03/27 22:25:20 [debug] 1695661#0: *1 lua udp socket waking up the current request
  2024/03/27 22:25:20 [debug] 1695661#0: *1 rewrite phase: 3
  ngx_http_wasm_rewrite_handler: enter at /home/code/ngx_wasm_module/src/http/ngx_http_wasm_module.c line 672.
  2024/03/27 22:25:20 [debug] 1695661#0: *1 wasm rctx reused: 000055CFADA90BB8 (r: 000055CFADA8FE40, main: 1)
  2024/03/27 22:25:20 [debug] 1695661#0: *1 any lua yielded? 1
  ngx_http_wasm_rewrite_handler: wasm lua forcing content wev handler at /home/code/ngx_wasm_module/src/http ngx_http_wasm_module.c line 685.
  ngx_http_wasm_wev_handler: enter at /home/code/ngx_wasm_module/src/http/ngx_http_wasm_module.c line 994.
  2024/03/27 22:25:20 [debug] 1695661#0: *1 wasm rctx reused: 000055CFADA90BB8 (r: 000055CFADA8FE40, main: 1)
  2024/03/27 22:25:20 [debug] 1695661#0: *1 wasm wev handler "/t?" - timeout: 0, ready: 1 (main: 1, count: 1, resp_finalized: 0, state: 0)
  ngx_http_wasm_wev_handler: will resume lua thread (lctx: 0x55cfad7d9eb0) at /home/code/ngx_wasm_module/src/http/ngx_http_wasm_module.c line 1025.
  2024/03/27 22:25:20 [debug] 1695661#0: *1 wasm resuming lua thread (lctx: 000055CFAD7D9EB0, L: 00007F5C8681A380, co: 00007F5C867E3DA8)
  ngx_http_wasm_set_resume_handler: resume_handler = ngx_http_core_run_phases (r: 0x55cfada8fe40, rctx: 0x55cfada90bb8) at /home/code/ngx_wasm_module/src/http/ngx_http_wasm_module.c line 1119.
  2024/03/27 22:25:20 [debug] 1695661#0: *1 lua udp operation done, resuming lua thread
  2024/03/27 22:25:20 [debug] 1695661#0: *1 lua udp socket calling prepare retvals handler 000055CFAC3A0954, u:00007F5C867D5EF0
  2024/03/27 22:25:20 [debug] 1695661#0: *1 lua udp socket receive return value handler
  lua *** ngx_http_lua_socket_udp_receive_retval_handler: buffer: <CB>Q<81><80> at ../ngx_lua-0.10.26/src/ngx_http_lua_socket_udp.c line 1031.
2. ---> u should be 0x7f5a8569d9c0 lua *** ngx_http_lua_socket_udp_receive_retval_handler: u->received: 0 (u: 0x7f5c867d5ef0) at ../ngx_lua-0.10.26/src/ngx_http_lua_socket_udp.c line 1032.
  lua *** dumpstack: top: 1 at ../ngx_lua-0.10.26/src/ngx_http_lua_util.h line 738.
3. ---> we get "" on top of the stack with luaL_pushlstring(L, buf, 0) lua *** dumpstack: 1    string  "" at ../ngx_lua-0.10.26/src/ngx_http_lua_util.h line 747.
  2024/03/27 22:25:20 [debug] 1695661#0: *1 lua run thread, top:9 c:1
  lua *** ngx_http_lua_run_thread: ctx: 0x55cfad91ecd0, co: 0x7f5c867e9708, co status: 0, co is_wrap: 0 at ../ngx_lua-0.10.26/src/ngx_http_lua_util.c line 1150.
  lua *** ngx_http_lua_run_thread: 0x7f5c867e9708: saved co top: 0, nrets: 1, true top: 1 at ../ngx_lua-0.10.26/src/ngx_http_lua_util.c line 1162.
  lua *** ngx_http_lua_run_thread: co top elem: string at ../ngx_lua-0.10.26/src/ngx_http_lua_util.c line 1170.
  2024/03/27 22:25:20 [notice] 1695661#0: *1 [lua] resolver.lua:633: parse_response(): buf:  len: 0, client: 127.0.0.1, server: localhost, request: "GET /t HTTP/1.1", host: "localhost"
  2024/03/27 22:25:20 [debug] 1695661#0: *1 [lua] resolver.lua:797: _tcp_query(): query the TCP server due to reply truncation

Since u is stored in coctx->data, and since the coctx of all our Lua threads comes from &ctx->entry_co_ctx, I believe that we are currently scheduling multiple udp sockets in the same coroutine context at the same time, oops. As you said, it is not plugged in correctly. Now I believe that we are not initializing our coroutines correctly. We should look closer at the facilities that create coroutines in OpenResty, like ngx_http_lua_uthread.c, ngx_http_lua_coroutine.c, and ngx_http_lua_coroutine_create_helper + ngx_http_lua_post_thread. If we instantiate our own coroutines + coroutine contexts, and attach them to ctx->cur_co_ctx (or &ctx->entry_co_ctx), we should get closer to what we are trying to do...

thibaultcha avatar Mar 28 '24 05:03 thibaultcha

Now I believe that we are not initializing our coroutines correctly.

Yep, that's what I feared.

We should look closer at facilities that create coroutines in OpenResty like ngx_http_lua_uthread.c, ngx_http_lua_coroutine.c, and ngx_http_lua_coroutine_create_helper + ngx_http_lua_post_thread.

This is going to be quite the journey. I've always been uncomfortable with that part of the codebase not knowing what exactly the various ctx-this-and-that variables exactly mean; about time I build a proper understanding of those bits. I'm going to deep dive into this, since it's a blocker for the other stuff I was doing. Thanks for the pointers!

hishamhm avatar Mar 28 '24 14:03 hishamhm

I think we will figure it out; I made good progress this morning. So far, both calls resume with the proper coctx locally (and the proper socket + u); I just need to tie some loose ends.

thibaultcha avatar Mar 28 '24 16:03 thibaultcha