msquic increase spinquic watchdog timeout

Description

As discussed in issue #5491, the logs show the watchdog assert firing. For now, let's increase the watchdog timeout by 100%.

Testing

CI

Documentation

N/A

Dec 09 '25 21:12 ProjectsByJackHe

Codecov Report

:white_check_mark: All modified and coverable lines are covered by tests. :white_check_mark: Project coverage is 85.64%. Comparing base (4e84609) to head (18ddaa0). :warning: Report is 1 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #5647      +/-   ##
==========================================
- Coverage   86.34%   85.64%   -0.71%     
==========================================
  Files          60       60              
  Lines       18663    18663              
==========================================
- Hits        16114    15983     -131     
- Misses       2549     2680     +131

Dec 09 '25 22:12 codecov[bot]

I am not that familiar with the spin test, but isn't this only going to make the spin test run longer? From a quick look at the sources, the time you changed controls how long the test spends spinning, and there is a separate WATCHDOG_WIGGLE_ROOM that gives the watchdog a bit of extra time.

Dec 09 '25 22:12 guhetier
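
To make the point above concrete, here is a minimal sketch of the relationship as described in the comment: the watchdog deadline is the spin duration plus a fixed wiggle room, so lengthening the spin duration moves the deadline by the same amount and leaves the margin unchanged. The function names and the 2000 ms value are assumptions for illustration only, not copied from the SpinQuic sources.

```cpp
#include <cassert>
#include <cstdint>

// Assumed value for illustration; the real constant is WATCHDOG_WIGGLE_ROOM
// in the SpinQuic sources and may differ.
constexpr uint32_t WATCHDOG_WIGGLE_ROOM_MS = 2000;

// Hypothetical helper: time after which the watchdog would assert,
// given how long the test spends spinning.
constexpr uint32_t WatchdogDeadlineMs(uint32_t SpinDurationMs) {
    return SpinDurationMs + WATCHDOG_WIGGLE_ROOM_MS;
}

// Hypothetical helper: extra time pending work has to complete once
// spinning stops, before the watchdog fires.
constexpr uint32_t ShutdownMarginMs(uint32_t SpinDurationMs) {
    return WatchdogDeadlineMs(SpinDurationMs) - SpinDurationMs;
}
```

Under these assumptions, `ShutdownMarginMs` returns the same value for any spin duration, so only a change to the wiggle room itself gives slow operations more time to finish.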

Yes! Good catch.

Dec 10 '25 01:12 ProjectsByJackHe

Did you investigate, based on the traces, what was pending when the timeout fired? 2 / 3 seconds is already quite a lot. Something may have been delayed on a slow VM, but it is also possible that a softlock / deadlock was happening in MsQuic.

Dec 10 '25 17:12 guhetier

Based on the ETL trace from the link I added in the issue, I couldn't find any deadlocks. There are comments in SpinQuic itself noting that certain code paths will lead to deadlocks, but those paths are all disabled.

Dec 10 '25 19:12 ProjectsByJackHe

OK. This might help, but I suspect going from 2 sec to 3 sec won't be a definitive fix. We should make sure dumps are collected so that next time we can check the state of the pending threads.

Dec 10 '25 21:12 guhetier
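
As a rough illustration of the dump-collection idea, the sketch below shows a generic watchdog that runs a caller-supplied hook when its deadline expires, which is the point where a process dump or a thread-state capture could be triggered. This is not the SpinQuic watchdog; the `Watchdog` class and the `OnTimeout` hook are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical watchdog: invokes OnTimeout (for example, a dump-collection
// hook) if the object is still alive when the timeout expires.
class Watchdog {
public:
    Watchdog(std::chrono::milliseconds Timeout, std::function<void()> OnTimeout)
        : Thread([this, Timeout, Cb = std::move(OnTimeout)] {
              std::unique_lock<std::mutex> Lock(Mutex);
              // wait_for returns false if the deadline passed while Done is still false.
              if (!Cv.wait_for(Lock, Timeout, [this] { return Done; })) {
                  Cb();
              }
          }) {
        assert(Timeout.count() >= 0);
    }

    // Destroying the watchdog before the deadline cancels it.
    ~Watchdog() {
        {
            std::lock_guard<std::mutex> Lock(Mutex);
            Done = true;
        }
        Cv.notify_one();
        Thread.join();
    }

private:
    std::mutex Mutex;
    std::condition_variable Cv;
    bool Done = false;
    std::thread Thread; // Declared last so the other members exist before the thread starts.
};
```

A test would create the watchdog on the stack around the shutdown phase, so the hook only runs when shutdown takes longer than the timeout. What the hook does to capture a dump is platform-specific and left out here.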