hidapi icon indicating copy to clipboard operation
hidapi copied to clipboard

hidapi/windows: do not wait in GetOverlappedResult() in hid_read_timeout()

Open slouken opened this issue 2 years ago • 4 comments

This is unsafe because the event is auto-reset, therefore the call to WaitForSingleObject() resets the event which GetOverlappedResult() will try to wait on.

Even though the overlapped operation is guaranteed to be completed at the point we call GetOverlappedResult(), it will still wait on the event handle for a short time to trigger the reset for auto-reset events. This amounts to roughly a 100 ms sleep each time GetOverlappedResult() is called for a completed I/O with a non-signalled event.

In the context of HIDAPI, this extra sleep means that callers that loop on hid_read_timeout() with timeout=0 will loop forever, since the 100 ms sleep each iteration ensures ReadFile() will always have new data.

slouken avatar May 26 '23 16:05 slouken

This looks valid, but how do we test the before/after?

Youw avatar Jun 01 '23 14:06 Youw

@cgutman, do you have a test framework so the hidapi folks can validate this change?

slouken avatar Jun 01 '23 14:06 slouken

@slouken and @cgutman

Any updates?

mcuee avatar Oct 19 '23 00:10 mcuee

I don't have a test case other than calling hid_read_timeout() in a loop with timeout 0 and hardware that produces HID reports more than every 100ms when hidapi.dll is compiled with a subsystem version value in the PE header of 6.1 or later (which triggers the Windows 7 app compat behavior for GetOverlappedResult()).

The docs on overlapped I/O synchronization allude to this behavior change on binaries targeting Windows 7 in the last paragraph regarding using WaitForSingleObject() with GetOverlappedResult(). There is further documentation on that change in the Application Manifest docs.

I believe the race condition Microsoft is addressing here is the following case:

  • Thread 1 - completes an overlapped I/O and sets OVERLAPPED.Internal = STATUS_SUCCESS
  • Thread 2 - calls GetOverlappedResult() with an auto-reset event in in OVERLAPPED.hEvent
  • Thread 2 - GetOverlappedResult() observes OVERLAPPED.Internal != STATUS_PENDING and returns TRUE
  • Thread 2 - issues a new overlapped I/O request using the same OVERLAPPED structure
  • Thread 1 - calls SetEvent(OVERLAPPED.hEvent) on the now-reused OVERLAPPED structure
  • Thread 2 - calls GetOverlappedResult() again which is falsely awoken by T1 before the new I/O request has completed

It is the changes to resolve the race condition mentioned in those docs that cause the bug fixed in this PR. Since the existing code waits on the auto-reset event prior to calling GetOverlappedResult() with wait=TRUE, GetOverlappedResult() will perform a 100ms wait on our behalf in order to ensure the auto-reset event is reset. That wait will time out because we already reset the event in the immediately preceding WaitForSingleObject() call.

The 100 ms wait itself is not documented officially anywhere, but it is clearly visible if you look at GetOverlappedResult() in a decompiler/disassembler such as Ghidra. Here is the decompiled source with my annotations:

BOOL __stdcall
GetOverlappedResult(HANDLE hFile,LPOVERLAPPED lpOverlapped,LPDWORD lpNumberOfBytesTransferred,
                   BOOL bWait)

{
  DWORD DVar1;
  BOOL BVar2;
  code *pcVar3;
  HANDLE hHandle;
  int local_res20 [2];
  
                    /* 0x26430  667  GetOverlappedResult */
  if (bWait == 0) {
    if (*(int *)&lpOverlapped->Internal == 0x103) {
      RtlSetLastWin32Error(0x3e4);
      return 0;
    }
    LOCK();
    UNLOCK();
  }
  else {
    local_res20[0] = 1; // Use Win7 behavior by default
    DVar1 = 0xffffffff; // INFINITE wait timeout by default
    pcVar3 = (code *)SbSelectProcedure(0xabababab,1,&DAT_1801901f0,0);
    if (pcVar3 != (code *)0x0) {
      (*pcVar3)(local_res20); // Query app compat APIs to determine whether to use Vista or Win7+ behavior
    }
    if (local_res20[0] == 0) {
      if (lpOverlapped->Internal != 0x103) goto LAB_180026467; // If we're using Vista behavior, exit without waiting if the I/O is not pending
    }
    else if (lpOverlapped->Internal != 0x103) {
      DVar1 = 100; // If we're using Win7 behavior and I/O is not pending, change the wait timeout to 100 ms
    }
    hHandle = (HANDLE)((ulonglong)hFile & 0xfffffffffffffffe); // Wait on the file handle unless hEvent is specified
    if (lpOverlapped->hEvent != (HANDLE)0x0) {
      hHandle = lpOverlapped->hEvent; // Wait on the event handle if hEvent is specified
    }
    DVar1 = WaitForSingleObjectEx(hHandle,DVar1,0); // Perform the wait
    if (DVar1 != 0) {
      BVar2 = FUN_1800df938();
      return BVar2;
    }
  }
LAB_180026467:
  FUN_180026730(*(undefined4 *)&lpOverlapped->Internal);
  *lpNumberOfBytesTransferred = *(DWORD *)&lpOverlapped->InternalHigh;
  return (uint)(-1 < *(int *)&lpOverlapped->Internal);
}

IMHO, it's pretty clear based on this code and other documentation that asking GetOverlappedResult() to wait on an auto-reset after WaitForSingleObject has already reset it is wrong and the fix in this PR is correct. It worked before with the Vista compatibility behavior because GetOverlappedResult() was skipping the wait in that case (and thus also not resetting the event), but it causes a 100ms delay inside every call to hid_read_timeout() if Win7+ behavior is in effect.

cgutman avatar Oct 19 '23 02:10 cgutman

@Youw

Let me carry out some tests with avrdude to see if there are any regressions. I do not have issues with hidapi-0.14.0 release (done the testing for avrdude 7.3 release recently). https://github.com/avrdudes/avrdude/blob/main/src/usb_hidapi.c

mcuee avatar Mar 05 '24 12:03 mcuee

@Youw

No regressions found. I think this PR is good to go.

Test with autotools build of avrdude git main, as well as hidapi DLL built using auto-tools as well.

$ ldd ./avrdude.exe
        ntdll.dll => /c/WINDOWS/SYSTEM32/ntdll.dll (0x7ffd165f0000)
        KERNEL32.DLL => /c/WINDOWS/System32/KERNEL32.DLL (0x7ffd158e0000)
        KERNELBASE.dll => /c/WINDOWS/System32/KERNELBASE.dll (0x7ffd13c30000)
        msvcrt.dll => /c/WINDOWS/System32/msvcrt.dll (0x7ffd15150000)
        SETUPAPI.dll => /c/WINDOWS/System32/SETUPAPI.dll (0x7ffd15a10000)
        HID.DLL => /c/WINDOWS/SYSTEM32/HID.DLL (0x7ffd12140000)
        WS2_32.dll => /c/WINDOWS/System32/WS2_32.dll (0x7ffd15400000)
        libftdi1.dll => /mingw64/bin/libftdi1.dll (0x7ffcf9eb0000)
        RPCRT4.dll => /c/WINDOWS/System32/RPCRT4.dll (0x7ffd15580000)
        libhidapi-0.dll => /mingw64/bin/libhidapi-0.dll (0x7ffcf86b0000)
        libreadline8.dll => /mingw64/bin/libreadline8.dll (0x7ffcf3f90000)
        libserialport-0.dll => /mingw64/bin/libserialport-0.dll (0x7ffcfe560000)
        libusb-1.0.dll => /mingw64/bin/libusb-1.0.dll (0x7ffcce550000)
        USER32.dll => /c/WINDOWS/System32/USER32.dll (0x7ffd15200000)
        ADVAPI32.dll => /c/WINDOWS/System32/ADVAPI32.dll (0x7ffd16420000)
        libusb-0-1-4.dll => /mingw64/bin/libusb-0-1-4.dll (0x7ffcf2770000)
        win32u.dll => /c/WINDOWS/System32/win32u.dll (0x7ffd141f0000)
        sechost.dll => /c/WINDOWS/System32/sechost.dll (0x7ffd14f30000)
        GDI32.dll => /c/WINDOWS/System32/GDI32.dll (0x7ffd159e0000)
        bcrypt.dll => /c/WINDOWS/System32/bcrypt.dll (0x7ffd141c0000)
        gdi32full.dll => /c/WINDOWS/System32/gdi32full.dll (0x7ffd14220000)
        msvcp_win.dll => /c/WINDOWS/System32/msvcp_win.dll (0x7ffd13fe0000)
        libtermcap-0.dll => /mingw64/bin/libtermcap-0.dll (0x7ffcf9ac0000)
        ucrtbase.dll => /c/WINDOWS/System32/ucrtbase.dll (0x7ffd13b10000)
        cfgmgr32.DLL => /c/WINDOWS/SYSTEM32/cfgmgr32.DLL (0x7ffd13620000)

Steps to build libhidapi-0.dll from this PR rebased to git.

  git clone https://github.com/libusb/hidapi.git hidapi_pr577
  cd ./hidapi_pr577
  git fetch origin pull/577/head
  git checkout -b pr577_rebase FETCH_HEAD
  git rebase origin/master
  ./bootstrap
  mkdir build_autotools
  cd build_autotools/
  ../configure --prefix=/mingw64
  make
  make install

Then I run avrdude.

$ ./avrdude.exe -c pkobn_updi -p 64ea48 -U ../../tools/test_files/the_quick_brown_fox_4096B.hex
avrdude: AVR device initialized and ready to accept instructions
avrdude: device signature = 0x1e961e (probably 64ea48)
avrdude: Note: flash memory has been specified, an erase cycle will be performed.
         To disable this feature, specify the -D option.
avrdude: erasing chip

avrdude: processing -U flash:w:../../tools/test_files/the_quick_brown_fox_4096B.hex:i
avrdude: reading input file ../../tools/test_files/the_quick_brown_fox_4096B.hex for flash
         with 4096 bytes in 1 section within [0, 0xfff]
         using 32 pages and 0 pad bytes
avrdude: writing 4096 bytes flash ...
Writing | ################################################## | 100% 1.96 s
avrdude: 4096 bytes of flash written
avrdude: verifying flash memory against ../../tools/test_files/the_quick_brown_fox_4096B.hex
Reading | ################################################## | 100% 0.99 s
avrdude: 4096 bytes of flash verified

avrdude done.  Thank you.

Similarly I generate libhidapi-0.dll from hidapi git and run avrdude again -- no issues as well.

mcuee avatar Mar 05 '24 12:03 mcuee

thanks

Youw avatar Mar 05 '24 12:03 Youw