msquic

Re-connecting is blocked when network changes

Open wanghaEMQ opened this issue 2 years ago • 5 comments

Describe the bug

MsQuic blocks inside the SetParam function when I try to establish a new connection to the server.

I defined QuicConnectionCallback as the callback for the QUIC connection. When it receives a QUIC_CONNECTION_EVENT_SHUTDOWN_COMPLETE event, it calls reconnect. In reconnect, I create a new connection with a resumption ticket (obtained from the server). Sometimes reconnect blocks in the call to SetParam. (I turn Wi-Fi off and on to simulate network changes.)

_IRQL_requires_max_(DISPATCH_LEVEL)
_Function_class_(QUIC_CONNECTION_CALLBACK)
QUIC_STATUS QUIC_API
QuicConnectionCallback(_In_ HQUIC Connection, _In_opt_ void *Context,
    _Inout_ QUIC_CONNECTION_EVENT *Event)
{
	switch (Event->Type) {
	case QUIC_CONNECTION_EVENT_SHUTDOWN_COMPLETE:
		if (!Event->SHUTDOWN_COMPLETE.AppCloseInProgress) {
			MsQuic->ConnectionClose(Connection);
		}

		if (rticket_active) {
			reconnect(qstrm);
		}
		break;
	default:
		break;
	}
	return QUIC_STATUS_SUCCESS;
}

static int
reconnect(quic_strm_t *qstrm)
{
	if (!LoadConfiguration(TRUE)) {
		return (-1);
	}

	QUIC_STATUS Status;
	HQUIC       Connection             = NULL;

	// Allocate a new connection object.
	if (QUIC_FAILED(Status = MsQuic->ConnectionOpen(Registration,
	                    QuicConnectionCallback, sock_data, &Connection))) {
		goto Error;
	}

	if (rticket_active) {
		// Blocked in SetParam
		if (QUIC_FAILED(Status = MsQuic->SetParam(Connection,
		                    QUIC_PARAM_CONN_RESUMPTION_TICKET, rticket_sz, rticket))) {
			goto Error;
		}
	}

	// Start the connection to the server.
	if (QUIC_FAILED(Status = MsQuic->ConnectionStart(Connection,
	                    Configuration, QUIC_ADDRESS_FAMILY_UNSPEC,
	                    host, atoi(port)))) {
		goto Error;
	}

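Error:

	// Clean up the new connection if any step above failed.
	if (QUIC_FAILED(Status) && Connection != NULL) {
		MsQuic->ConnectionClose(Connection);
	}

	// Report failure to the caller instead of always returning 0.
	return QUIC_FAILED(Status) ? -1 : 0;
}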

Affected OS

  • [ ] All
  • [ ] Windows Server 2022
  • [ ] Windows 11, version 22H2
  • [ ] Windows 11, version 21H2
  • [ ] Windows Insider Preview (specify affected build below)
  • [X] Ubuntu
  • [ ] Debian
  • [ ] Other (specify below)

Additional OS information

No response

MsQuic version

release/2.0

Steps taken to reproduce bug

  1. Set QuicConnectionCallback as the callback for the connection.
  2. Call reconnect from QuicConnectionCallback when the QUIC_CONNECTION_EVENT_SHUTDOWN_COMPLETE event arrives.
  3. Turn off Wi-Fi, wait for reconnect to be called, then turn Wi-Fi back on.
  4. If the call does not block, repeat step 3 until it does.

Expected behavior

reconnect should work.

Actual outcome

reconnect blocks inside the SetParam function.

Additional details

Here is the backtrace for all MsQuic threads.

Thread 42 (Thread 0x7fffdf7d7700 (LWP 3558651)):
#0  futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7fffdf7d6a10) at ../sysdeps/nptl/futex-internal.h:183
#1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x7fffdf7d69c0, cond=0x7fffdf7d69e8) at pthread_cond_wait.c:508
#2  __pthread_cond_wait (cond=cond@entry=0x7fffdf7d69e8, mutex=mutex@entry=0x7fffdf7d69c0) at pthread_cond_wait.c:647
#3  0x00007ffff723386b in CxPlatInternalEventWaitForever (Event=0x7fffdf7d69c0) at ../nng/extern/msquic/src/inc/quic_platform_posix.h:811
#4  MsQuicSetParam (Handle=0x61f00002f480, Param=<optimized out>, BufferLength=<optimized out>, Buffer=<optimized out>) at ../nng/extern/msquic/src/core/api.c:1405
#5  0x00005555556a9053 in quic_reconnect (qstrm=0x61e000000080) at ../nng/src/supplemental/quic/quic_api.c:585
#6  0x00005555556a7dd3 in QuicConnectionCallback (Connection=0x61f000010a80, Context=0x61d000000fe8, Event=0x7fffdf7d6be0) at ../nng/src/supplemental/quic/quic_api.c:353
#7  0x00007ffff7242702 in QuicConnIndicateEvent (Connection=Connection@entry=0x61f000010a80, Event=Event@entry=0x7fffdf7d6be0) at ../nng/extern/msquic/src/core/connection.c:729
#8  0x00007ffff72454fe in QuicConnOnShutdownComplete (Connection=Connection@entry=0x61f000010a80) at ../nng/extern/msquic/src/core/connection.c:1457
#9  0x00007ffff725f100 in QuicConnDrainOperations (Connection=Connection@entry=0x61f000010a80) at ../nng/extern/msquic/src/core/connection.c:7412
#10 0x00007ffff722b3dd in QuicWorkerProcessConnection (Worker=Worker@entry=0x628000003360, Connection=0x61f000010a80, ThreadID=ThreadID@entry=3558651, TimeNow=TimeNow@entry=0x7fffdf7d6e20) at ../nng/extern/msquic/src/core/worker.c:509
#11 0x00007ffff722c49c in QuicWorkerLoop (Context=Context@entry=0x628000003360, TimeNow=TimeNow@entry=0x7fffdf7d6e20, ThreadID=ThreadID@entry=3558651) at ../nng/extern/msquic/src/core/worker.c:670
#12 0x00007ffff722cc3b in QuicWorkerThread (Context=0x628000003360) at ../nng/extern/msquic/src/core/worker.c:735
#13 0x00007ffff7182609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#14 0x00007ffff707e133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 41 (Thread 0x7fffdffd8700 (LWP 3558650)):
#0  futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x628000002d44) at ../sysdeps/nptl/futex-internal.h:183
#1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x628000002cf0, cond=0x628000002d18) at pthread_cond_wait.c:508
#2  __pthread_cond_wait (cond=cond@entry=0x628000002d18, mutex=mutex@entry=0x628000002cf0) at pthread_cond_wait.c:647
#3  0x00007ffff722ca5f in CxPlatInternalEventWaitForever (Event=0x628000002cf0) at ../nng/extern/msquic/src/inc/quic_platform_posix.h:811
#4  QuicWorkerThread (Context=0x628000002c30) at ../nng/extern/msquic/src/core/worker.c:738
#5  0x00007ffff7182609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#6  0x00007ffff707e133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 40 (Thread 0x7fffe07d9700 (LWP 3558649)):
#0  futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x628000002614) at ../sysdeps/nptl/futex-internal.h:183
#1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x6280000025c0, cond=0x6280000025e8) at pthread_cond_wait.c:508
#2  __pthread_cond_wait (cond=cond@entry=0x6280000025e8, mutex=mutex@entry=0x6280000025c0) at pthread_cond_wait.c:647
#3  0x00007ffff722ca5f in CxPlatInternalEventWaitForever (Event=0x6280000025c0) at ../nng/extern/msquic/src/inc/quic_platform_posix.h:811
#4  QuicWorkerThread (Context=0x628000002500) at ../nng/extern/msquic/src/core/worker.c:738
#5  0x00007ffff7182609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#6  0x00007ffff707e133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 39 (Thread 0x7fffe0fda700 (LWP 3558648)):
#0  futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x628000001ee0) at ../sysdeps/nptl/futex-internal.h:183
#1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x628000001e90, cond=0x628000001eb8) at pthread_cond_wait.c:508
#2  __pthread_cond_wait (cond=cond@entry=0x628000001eb8, mutex=mutex@entry=0x628000001e90) at pthread_cond_wait.c:647
#3  0x00007ffff722ca5f in CxPlatInternalEventWaitForever (Event=0x628000001e90) at ../nng/extern/msquic/src/inc/quic_platform_posix.h:811
#4  QuicWorkerThread (Context=0x628000001dd0) at ../nng/extern/msquic/src/core/worker.c:738
#5  0x00007ffff7182609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#6  0x00007ffff707e133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 38 (Thread 0x7fffe17db700 (LWP 3558647)):
#0  futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x6280000017b0) at ../sysdeps/nptl/futex-internal.h:183
#1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x628000001760, cond=0x628000001788) at pthread_cond_wait.c:508
#2  __pthread_cond_wait (cond=cond@entry=0x628000001788, mutex=mutex@entry=0x628000001760) at pthread_cond_wait.c:647
#3  0x00007ffff722ca5f in CxPlatInternalEventWaitForever (Event=0x628000001760) at ../nng/extern/msquic/src/inc/quic_platform_posix.h:811
#4  QuicWorkerThread (Context=0x6280000016a0) at ../nng/extern/msquic/src/core/worker.c:738
#5  0x00007ffff7182609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#6  0x00007ffff707e133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 37 (Thread 0x7fffe1fdc700 (LWP 3558646)):
#0  futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x628000001080) at ../sysdeps/nptl/futex-internal.h:183
#1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x628000001030, cond=0x628000001058) at pthread_cond_wait.c:508
#2  __pthread_cond_wait (cond=cond@entry=0x628000001058, mutex=mutex@entry=0x628000001030) at pthread_cond_wait.c:647
#3  0x00007ffff722ca5f in CxPlatInternalEventWaitForever (Event=0x628000001030) at ../nng/extern/msquic/src/inc/quic_platform_posix.h:811
#4  QuicWorkerThread (Context=0x628000000f70) at ../nng/extern/msquic/src/core/worker.c:738
#5  0x00007ffff7182609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#6  0x00007ffff707e133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 36 (Thread 0x7fffe27dd700 (LWP 3558645)):
#0  futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x628000000950) at ../sysdeps/nptl/futex-internal.h:183
#1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x628000000900, cond=0x628000000928) at pthread_cond_wait.c:508
#2  __pthread_cond_wait (cond=cond@entry=0x628000000928, mutex=mutex@entry=0x628000000900) at pthread_cond_wait.c:647
#3  0x00007ffff722ca5f in CxPlatInternalEventWaitForever (Event=0x628000000900) at ../nng/extern/msquic/src/inc/quic_platform_posix.h:811
#4  QuicWorkerThread (Context=0x628000000840) at ../nng/extern/msquic/src/core/worker.c:738
#5  0x00007ffff7182609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#6  0x00007ffff707e133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 35 (Thread 0x7fffe2fde700 (LWP 3558644)):
#0  futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x628000000224) at ../sysdeps/nptl/futex-internal.h:183
#1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x6280000001d0, cond=0x6280000001f8) at pthread_cond_wait.c:508
#2  __pthread_cond_wait (cond=cond@entry=0x6280000001f8, mutex=mutex@entry=0x6280000001d0) at pthread_cond_wait.c:647
#3  0x00007ffff722ca5f in CxPlatInternalEventWaitForever (Event=0x6280000001d0) at ../nng/extern/msquic/src/inc/quic_platform_posix.h:811
#4  QuicWorkerThread (Context=0x628000000110) at ../nng/extern/msquic/src/core/worker.c:738
#5  0x00007ffff7182609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#6  0x00007ffff707e133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 34 (Thread 0x7fffe37df700 (LWP 3558643)):
#0  0x00007ffff707e46e in epoll_wait (epfd=21, events=events@entry=0x7fffe37decb0, maxevents=maxevents@entry=16, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1  0x00007ffff72c7fe1 in CxPlatDataPathRunEC (Context=Context@entry=0x6190000009f0, CurThreadId=<optimized out>, WaitTime=WaitTime@entry=4294967295) at ../nng/extern/msquic/src/platform/datapath_epoll.c:2753
#2  0x00007ffff72d1d75 in CxPlatWorkerThread (Context=0x619000000970) at ../nng/extern/msquic/src/platform/platform_worker.c:280
#3  0x00007ffff7182609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#4  0x00007ffff707e133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 33 (Thread 0x7fffe3fe0700 (LWP 3558642)):
#0  0x00007ffff707e46e in epoll_wait (epfd=19, events=events@entry=0x7fffe3fdfcb0, maxevents=maxevents@entry=16, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1  0x00007ffff72c7fe1 in CxPlatDataPathRunEC (Context=Context@entry=0x619000000960, CurThreadId=<optimized out>, WaitTime=WaitTime@entry=4294967295) at ../nng/extern/msquic/src/platform/datapath_epoll.c:2753
#2  0x00007ffff72d1d75 in CxPlatWorkerThread (Context=0x6190000008e0) at ../nng/extern/msquic/src/platform/platform_worker.c:280
#3  0x00007ffff7182609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#4  0x00007ffff707e133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 32 (Thread 0x7fffe47e1700 (LWP 3558641)):
#0  0x00007ffff707e46e in epoll_wait (epfd=17, events=events@entry=0x7fffe47e0cb0, maxevents=maxevents@entry=16, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1  0x00007ffff72c7fe1 in CxPlatDataPathRunEC (Context=Context@entry=0x6190000008d0, CurThreadId=<optimized out>, WaitTime=WaitTime@entry=4294967295) at ../nng/extern/msquic/src/platform/datapath_epoll.c:2753
#2  0x00007ffff72d1d75 in CxPlatWorkerThread (Context=0x619000000850) at ../nng/extern/msquic/src/platform/platform_worker.c:280
#3  0x00007ffff7182609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#4  0x00007ffff707e133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 31 (Thread 0x7fffe4fe2700 (LWP 3558640)):
#0  0x00007ffff707e46e in epoll_wait (epfd=15, events=events@entry=0x7fffe4fe1cb0, maxevents=maxevents@entry=16, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1  0x00007ffff72c7fe1 in CxPlatDataPathRunEC (Context=Context@entry=0x619000000840, CurThreadId=<optimized out>, WaitTime=WaitTime@entry=4294967295) at ../nng/extern/msquic/src/platform/datapath_epoll.c:2753
#2  0x00007ffff72d1d75 in CxPlatWorkerThread (Context=0x6190000007c0) at ../nng/extern/msquic/src/platform/platform_worker.c:280
#3  0x00007ffff7182609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#4  0x00007ffff707e133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 30 (Thread 0x7fffe57e3700 (LWP 3558639)):
#0  0x00007ffff707e46e in epoll_wait (epfd=13, events=events@entry=0x7fffe57e2cb0, maxevents=maxevents@entry=16, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1  0x00007ffff72c7fe1 in CxPlatDataPathRunEC (Context=Context@entry=0x6190000007b0, CurThreadId=<optimized out>, WaitTime=WaitTime@entry=4294967295) at ../nng/extern/msquic/src/platform/datapath_epoll.c:2753
#2  0x00007ffff72d1d75 in CxPlatWorkerThread (Context=0x619000000730) at ../nng/extern/msquic/src/platform/platform_worker.c:280
#3  0x00007ffff7182609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#4  0x00007ffff707e133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 29 (Thread 0x7fffe5fe4700 (LWP 3558638)):
#0  0x00007ffff707e46e in epoll_wait (epfd=11, events=events@entry=0x7fffe5fe3cb0, maxevents=maxevents@entry=16, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1  0x00007ffff72c7fe1 in CxPlatDataPathRunEC (Context=Context@entry=0x619000000720, CurThreadId=<optimized out>, WaitTime=WaitTime@entry=4294967295) at ../nng/extern/msquic/src/platform/datapath_epoll.c:2753
#2  0x00007ffff72d1d75 in CxPlatWorkerThread (Context=0x6190000006a0) at ../nng/extern/msquic/src/platform/platform_worker.c:280
#3  0x00007ffff7182609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#4  0x00007ffff707e133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 28 (Thread 0x7fffe67e5700 (LWP 3558637)):
#0  0x00007ffff707e46e in epoll_wait (epfd=9, events=events@entry=0x7fffe67e4cb0, maxevents=maxevents@entry=16, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1  0x00007ffff72c7fe1 in CxPlatDataPathRunEC (Context=Context@entry=0x619000000690, CurThreadId=<optimized out>, WaitTime=WaitTime@entry=4294967295) at ../nng/extern/msquic/src/platform/datapath_epoll.c:2753
#2  0x00007ffff72d1d75 in CxPlatWorkerThread (Context=0x619000000610) at ../nng/extern/msquic/src/platform/platform_worker.c:280
#3  0x00007ffff7182609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#4  0x00007ffff707e133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 27 (Thread 0x7fffe6fe6700 (LWP 3558636)):
#0  0x00007ffff707e46e in epoll_wait (epfd=7, events=events@entry=0x7fffe6fe5cb0, maxevents=maxevents@entry=16, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1  0x00007ffff72c7fe1 in CxPlatDataPathRunEC (Context=Context@entry=0x619000000600, CurThreadId=<optimized out>, WaitTime=WaitTime@entry=4294967295) at ../nng/extern/msquic/src/platform/datapath_epoll.c:2753
#2  0x00007ffff72d1d75 in CxPlatWorkerThread (Context=0x619000000580) at ../nng/extern/msquic/src/platform/platform_worker.c:280
#3  0x00007ffff7182609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#4  0x00007ffff707e133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

wanghaEMQ, Sep 01 '22 07:09

You can't use an existing Connection's callback to make calls on (or create a) different connection. Unfortunately, our documentation doesn't quite make that requirement clear.

nibanks, Sep 01 '22 11:09
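
The thread does not spell out why the call hangs. The backtrace shows SetParam waiting on an event while running on a MsQuic worker thread. The toy model below sketches one plausible mechanism. It is an assumption, not something the maintainers confirm here: a synchronous call queues work for a worker and waits for it, and if the caller is that same worker, nobody is left to complete the work. Every name in it (toy_worker_t, toy_sync_call, toy_demo) is hypothetical, and none of it is MsQuic code. A timeout stands in for the real "wait forever" so the demo terminates.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Toy model, not MsQuic code: one worker thread completes queued operations. */
typedef struct toy_worker {
	pthread_mutex_t lock;
	pthread_cond_t  cond;
	bool            queued;        /* an operation awaits the worker */
	bool            done;          /* the worker completed an operation */
	bool            stop;          /* ask the worker loop to exit */
	int             from_callback; /* result of the call made on the worker */
} toy_worker_t;

/* Queue an operation, then wait up to timeout_s seconds for the worker to
 * complete it. Returns 0 on completion, else the wait error (ETIMEDOUT). */
static int
toy_sync_call(toy_worker_t *w, int timeout_s)
{
	struct timespec deadline = { .tv_sec = time(NULL) + timeout_s, .tv_nsec = 0 };
	int rc = 0;

	pthread_mutex_lock(&w->lock);
	w->queued = true;
	w->done   = false;
	pthread_cond_broadcast(&w->cond);
	while (!w->done && rc == 0) {
		rc = pthread_cond_timedwait(&w->cond, &w->lock, &deadline);
	}
	rc = w->done ? 0 : rc;
	pthread_mutex_unlock(&w->lock);
	return rc;
}

static void *
toy_worker_main(void *p)
{
	toy_worker_t *w = p;

	/* "Callback" running on the worker thread: it waits for an operation
	 * that only this same thread could complete, so it cannot succeed. */
	w->from_callback = toy_sync_call(w, 2);

	/* Back in the worker loop, queued operations are completed again. */
	pthread_mutex_lock(&w->lock);
	while (!w->stop) {
		if (w->queued) {
			w->queued = false;
			w->done   = true;
			pthread_cond_broadcast(&w->cond);
		} else {
			pthread_cond_wait(&w->cond, &w->lock);
		}
	}
	pthread_mutex_unlock(&w->lock);
	return NULL;
}

/* Runs the model and reports what each caller observed. */
static void
toy_demo(int *from_callback, int *from_other_thread)
{
	toy_worker_t w = { .queued = false };
	pthread_t    t;

	pthread_mutex_init(&w.lock, NULL);
	pthread_cond_init(&w.cond, NULL);
	int rc = pthread_create(&t, NULL, toy_worker_main, &w);
	assert(rc == 0);
	(void) rc;

	/* The same call made from a different thread is served by the worker. */
	*from_other_thread = toy_sync_call(&w, 5);

	pthread_mutex_lock(&w.lock);
	w.stop = true;
	pthread_cond_broadcast(&w.cond);
	pthread_mutex_unlock(&w.lock);
	pthread_join(t, NULL);

	*from_callback = w.from_callback;
	pthread_cond_destroy(&w.cond);
	pthread_mutex_destroy(&w.lock);
}
```

In this model the call made from the worker's own "callback" can only time out, while the identical call from another thread completes once the worker is back in its loop.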

Thanks. Does MsQuic provide an elegant way to reconnect (in a callback or not)?

wanghaEMQ, Sep 03 '22 04:09

No, we don't have anything special. You just have to do it on your own thread.

nibanks, Sep 03 '22 12:09
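
A minimal sketch of the "do it on your own thread" advice, assuming a plain pthread hand-off: the connection callback only records that a reconnect is wanted, and a dedicated application thread does the work. All names here (reconnect_worker_t, reconnect_worker_request, demo_reconnect, and so on) are hypothetical and not part of MsQuic. demo_reconnect is a stand-in for a wrapper around the reporter's reconnect(qstrm).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hand-off object owned by the application, not by MsQuic. */
typedef struct reconnect_worker {
	pthread_mutex_t lock;
	pthread_cond_t  wake;
	bool            pending;   /* a reconnect has been requested */
	bool            stopping;  /* the worker should exit when idle */
	int           (*reconnect_fn)(void *arg); /* e.g. wraps reconnect(qstrm) */
	void           *arg;
	int             runs;      /* number of reconnect attempts made */
	pthread_t       thread;
} reconnect_worker_t;

/* Stand-in for the real reconnect: only counts how often it was called. */
static int
demo_reconnect(void *arg)
{
	++*(int *) arg;
	return 0;
}

static void *
reconnect_worker_main(void *p)
{
	reconnect_worker_t *w = p;

	pthread_mutex_lock(&w->lock);
	for (;;) {
		while (!w->pending && !w->stopping) {
			pthread_cond_wait(&w->wake, &w->lock);
		}
		if (!w->pending) {
			break; /* stopping, and nothing left to do */
		}
		w->pending = false;
		pthread_mutex_unlock(&w->lock);
		w->reconnect_fn(w->arg); /* runs outside any connection callback */
		pthread_mutex_lock(&w->lock);
		w->runs++;
	}
	pthread_mutex_unlock(&w->lock);
	return NULL;
}

/* Returns 0 on success, or the pthread_create error code. */
static int
reconnect_worker_start(reconnect_worker_t *w, int (*fn)(void *), void *arg)
{
	assert(w != NULL && fn != NULL);
	pthread_mutex_init(&w->lock, NULL);
	pthread_cond_init(&w->wake, NULL);
	w->pending      = false;
	w->stopping     = false;
	w->reconnect_fn = fn;
	w->arg          = arg;
	w->runs         = 0;
	return pthread_create(&w->thread, NULL, reconnect_worker_main, w);
}

/* Intended to be called from the connection callback: it only sets a flag
 * and signals the worker, so it never waits on another connection. */
static void
reconnect_worker_request(reconnect_worker_t *w)
{
	pthread_mutex_lock(&w->lock);
	w->pending = true;
	pthread_cond_signal(&w->wake);
	pthread_mutex_unlock(&w->lock);
}

/* Lets a pending request finish, then joins and cleans up the worker. */
static void
reconnect_worker_stop(reconnect_worker_t *w)
{
	pthread_mutex_lock(&w->lock);
	w->stopping = true;
	pthread_cond_signal(&w->wake);
	pthread_mutex_unlock(&w->lock);
	pthread_join(w->thread, NULL);
	pthread_cond_destroy(&w->wake);
	pthread_mutex_destroy(&w->lock);
}
```

With something like this in place, the SHUTDOWN_COMPLETE branch of the callback would call reconnect_worker_request(&worker) instead of reconnect(qstrm), where worker is an application-level object started once at startup. Requests that arrive before the worker wakes are coalesced into a single reconnect attempt, which seems reasonable for this use but is a design choice of the sketch.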

ok, thanks a lot

wanghaEMQ, Sep 04 '22 08:09

Walkthrough

The documentation needs to be updated to make clear that cross-connection calls must be made from a separate thread.

nibanks, May 08 '23 22:05