hidapi/windows: do not wait in GetOverlappedResult() in hid_read_timeout()
This is unsafe because the event is auto-reset, therefore the call to WaitForSingleObject() resets the event which GetOverlappedResult() will try to wait on.
Even though the overlapped operation is guaranteed to be completed at the point we call GetOverlappedResult(), it will still wait on the event handle for a short time to trigger the reset for auto-reset events. This amounts to roughly a 100 ms sleep each time GetOverlappedResult() is called for a completed I/O with a non-signalled event.
In the context of HIDAPI, this extra sleep means that callers that loop on hid_read_timeout() with timeout=0 will loop forever, since the 100 ms sleep each iteration ensures ReadFile() will always have new data.
This looks valid, but how do we test the before/after?
@cgutman, do you have a test framework so the hidapi folks can validate this change?
@slouken and @cgutman
Any updates?
I don't have a test case other than calling hid_read_timeout() in a loop with timeout 0 and hardware that produces HID reports more than every 100ms when hidapi.dll is compiled with a subsystem version value in the PE header of 6.1 or later (which triggers the Windows 7 app compat behavior for GetOverlappedResult()).
The docs on overlapped I/O synchronization allude to this behavior change on binaries targeting Windows 7 in the last paragraph regarding using WaitForSingleObject() with GetOverlappedResult(). There is further documentation on that change in the Application Manifest docs.
I believe the race condition Microsoft is addressing here is the following case:
- Thread 1 - completes an overlapped I/O and sets
OVERLAPPED.Internal = STATUS_SUCCESS - Thread 2 - calls
GetOverlappedResult()with an auto-reset event in inOVERLAPPED.hEvent - Thread 2 -
GetOverlappedResult()observesOVERLAPPED.Internal != STATUS_PENDINGand returns TRUE - Thread 2 - issues a new overlapped I/O request using the same
OVERLAPPEDstructure - Thread 1 - calls
SetEvent(OVERLAPPED.hEvent)on the now-reusedOVERLAPPEDstructure - Thread 2 - calls
GetOverlappedResult()again which is falsely awoken by T1 before the new I/O request has completed
It is the changes to resolve the race condition mentioned in those docs that cause the bug fixed in this PR. Since the existing code waits on the auto-reset event prior to calling GetOverlappedResult() with wait=TRUE, GetOverlappedResult() will perform a 100ms wait on our behalf in order to ensure the auto-reset event is reset. That wait will time out because we already reset the event in the immediately preceding WaitForSingleObject() call.
The 100 ms wait itself is not documented officially anywhere, but it is clearly visible if you look at GetOverlappedResult() in a decompiler/disassembler such as Ghidra. Here is the decompiled source with my annotations:
BOOL __stdcall
GetOverlappedResult(HANDLE hFile,LPOVERLAPPED lpOverlapped,LPDWORD lpNumberOfBytesTransferred,
BOOL bWait)
{
DWORD DVar1;
BOOL BVar2;
code *pcVar3;
HANDLE hHandle;
int local_res20 [2];
/* 0x26430 667 GetOverlappedResult */
if (bWait == 0) {
if (*(int *)&lpOverlapped->Internal == 0x103) {
RtlSetLastWin32Error(0x3e4);
return 0;
}
LOCK();
UNLOCK();
}
else {
local_res20[0] = 1; // Use Win7 behavior by default
DVar1 = 0xffffffff; // INFINITE wait timeout by default
pcVar3 = (code *)SbSelectProcedure(0xabababab,1,&DAT_1801901f0,0);
if (pcVar3 != (code *)0x0) {
(*pcVar3)(local_res20); // Query app compat APIs to determine whether to use Vista or Win7+ behavior
}
if (local_res20[0] == 0) {
if (lpOverlapped->Internal != 0x103) goto LAB_180026467; // If we're using Vista behavior, exit without waiting if the I/O is not pending
}
else if (lpOverlapped->Internal != 0x103) {
DVar1 = 100; // If we're using Win7 behavior and I/O is not pending, change the wait timeout to 100 ms
}
hHandle = (HANDLE)((ulonglong)hFile & 0xfffffffffffffffe); // Wait on the file handle unless hEvent is specified
if (lpOverlapped->hEvent != (HANDLE)0x0) {
hHandle = lpOverlapped->hEvent; // Wait on the event handle if hEvent is specified
}
DVar1 = WaitForSingleObjectEx(hHandle,DVar1,0); // Perform the wait
if (DVar1 != 0) {
BVar2 = FUN_1800df938();
return BVar2;
}
}
LAB_180026467:
FUN_180026730(*(undefined4 *)&lpOverlapped->Internal);
*lpNumberOfBytesTransferred = *(DWORD *)&lpOverlapped->InternalHigh;
return (uint)(-1 < *(int *)&lpOverlapped->Internal);
}
IMHO, it's pretty clear based on this code and other documentation that asking GetOverlappedResult() to wait on an auto-reset after WaitForSingleObject has already reset it is wrong and the fix in this PR is correct. It worked before with the Vista compatibility behavior because GetOverlappedResult() was skipping the wait in that case (and thus also not resetting the event), but it causes a 100ms delay inside every call to hid_read_timeout() if Win7+ behavior is in effect.
@Youw
Let me carry out some tests with avrdude to see if there are any regressions. I do not have issues with hidapi-0.14.0 release (done the testing for avrdude 7.3 release recently). https://github.com/avrdudes/avrdude/blob/main/src/usb_hidapi.c
@Youw
No regressions found. I think this PR is good to go.
Test with autotools build of avrdude git main, as well as hidapi DLL built using auto-tools as well.
$ ldd ./avrdude.exe
ntdll.dll => /c/WINDOWS/SYSTEM32/ntdll.dll (0x7ffd165f0000)
KERNEL32.DLL => /c/WINDOWS/System32/KERNEL32.DLL (0x7ffd158e0000)
KERNELBASE.dll => /c/WINDOWS/System32/KERNELBASE.dll (0x7ffd13c30000)
msvcrt.dll => /c/WINDOWS/System32/msvcrt.dll (0x7ffd15150000)
SETUPAPI.dll => /c/WINDOWS/System32/SETUPAPI.dll (0x7ffd15a10000)
HID.DLL => /c/WINDOWS/SYSTEM32/HID.DLL (0x7ffd12140000)
WS2_32.dll => /c/WINDOWS/System32/WS2_32.dll (0x7ffd15400000)
libftdi1.dll => /mingw64/bin/libftdi1.dll (0x7ffcf9eb0000)
RPCRT4.dll => /c/WINDOWS/System32/RPCRT4.dll (0x7ffd15580000)
libhidapi-0.dll => /mingw64/bin/libhidapi-0.dll (0x7ffcf86b0000)
libreadline8.dll => /mingw64/bin/libreadline8.dll (0x7ffcf3f90000)
libserialport-0.dll => /mingw64/bin/libserialport-0.dll (0x7ffcfe560000)
libusb-1.0.dll => /mingw64/bin/libusb-1.0.dll (0x7ffcce550000)
USER32.dll => /c/WINDOWS/System32/USER32.dll (0x7ffd15200000)
ADVAPI32.dll => /c/WINDOWS/System32/ADVAPI32.dll (0x7ffd16420000)
libusb-0-1-4.dll => /mingw64/bin/libusb-0-1-4.dll (0x7ffcf2770000)
win32u.dll => /c/WINDOWS/System32/win32u.dll (0x7ffd141f0000)
sechost.dll => /c/WINDOWS/System32/sechost.dll (0x7ffd14f30000)
GDI32.dll => /c/WINDOWS/System32/GDI32.dll (0x7ffd159e0000)
bcrypt.dll => /c/WINDOWS/System32/bcrypt.dll (0x7ffd141c0000)
gdi32full.dll => /c/WINDOWS/System32/gdi32full.dll (0x7ffd14220000)
msvcp_win.dll => /c/WINDOWS/System32/msvcp_win.dll (0x7ffd13fe0000)
libtermcap-0.dll => /mingw64/bin/libtermcap-0.dll (0x7ffcf9ac0000)
ucrtbase.dll => /c/WINDOWS/System32/ucrtbase.dll (0x7ffd13b10000)
cfgmgr32.DLL => /c/WINDOWS/SYSTEM32/cfgmgr32.DLL (0x7ffd13620000)
Steps to build libhidapi-0.dll from this PR rebased to git.
git clone https://github.com/libusb/hidapi.git hidapi_pr577
cd ./hidapi_pr577
git fetch origin pull/577/head
git checkout -b pr577_rebase FETCH_HEAD
git rebase origin/master
./bootstrap
mkdir build_autotools
cd build_autotools/
../configure --prefix=/mingw64
make
make install
Then I run avrdude.
$ ./avrdude.exe -c pkobn_updi -p 64ea48 -U ../../tools/test_files/the_quick_brown_fox_4096B.hex
avrdude: AVR device initialized and ready to accept instructions
avrdude: device signature = 0x1e961e (probably 64ea48)
avrdude: Note: flash memory has been specified, an erase cycle will be performed.
To disable this feature, specify the -D option.
avrdude: erasing chip
avrdude: processing -U flash:w:../../tools/test_files/the_quick_brown_fox_4096B.hex:i
avrdude: reading input file ../../tools/test_files/the_quick_brown_fox_4096B.hex for flash
with 4096 bytes in 1 section within [0, 0xfff]
using 32 pages and 0 pad bytes
avrdude: writing 4096 bytes flash ...
Writing | ################################################## | 100% 1.96 s
avrdude: 4096 bytes of flash written
avrdude: verifying flash memory against ../../tools/test_files/the_quick_brown_fox_4096B.hex
Reading | ################################################## | 100% 0.99 s
avrdude: 4096 bytes of flash verified
avrdude done. Thank you.
Similarly I generate libhidapi-0.dll from hidapi git and run avrdude again -- no issues as well.
thanks