usbip_windows

From SourceForge: BSoD on detach.

Open Oxalin opened this issue 8 years ago • 7 comments

Reported by: DanT Date: Link: https://sourceforge.net/p/usbip/discussion/418507/thread/7ff86875/?limit=25&page=1#f556

Description: BSoD on detach in "complete_pending_irp" in the function "bus_unplug_dev".

This may have been fixed by commit b7bfa2c67d68096882cfd7dfa1e257122b7dfca9, which I included from Daniel Mitchell's patch. There were at least two possible errors with the IRQL not being set correctly. However, I suspect another error could still happen in complete_pending_irp() in pnp.c.

Oxalin avatar Jan 31 '18 07:01 Oxalin

What needs to be done:

  • Review the state of complete_pending_irp() and make sure there is no code path on which the IRQL is not set or restored appropriately.

Oxalin avatar Jan 31 '18 07:01 Oxalin

From investigation: it seems one of the previous developers hit a very similar bug elsewhere in the code, in process_write_irp(), where he had to wrap I/O calls with IRQL raising and lowering. We may need to do the same in other places, not just in complete_pending_irp().
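To illustrate what wrapping a call with IRQL raising and lowering means, here is a minimal user-space sketch in C. It is not WDK or usbip_windows code: every model_* name is hypothetical, the two levels only mimic PASSIVE_LEVEL and DISPATCH_LEVEL, and the level actually used in process_write_irp() is not stated in this thread, so DISPATCH level is an assumption. The point is only that the saved level is restored on every exit path, including when the wrapped call fails.

```c
#include <assert.h>

/* User-space model only: the model_* names are hypothetical,
 * not WDK or usbip_windows APIs. */
enum { MODEL_PASSIVE_LEVEL = 0, MODEL_DISPATCH_LEVEL = 2 };

int model_current_irql = MODEL_PASSIVE_LEVEL;

/* Models KeRaiseIrql(): raise the level and hand back the previous one. */
void model_raise_irql(int new_irql, int *old_irql)
{
    assert(new_irql >= model_current_irql); /* raising must never lower */
    *old_irql = model_current_irql;
    model_current_irql = new_irql;
}

/* Models KeLowerIrql(): restore a previously saved level. */
void model_lower_irql(int irql)
{
    assert(irql <= model_current_irql); /* lowering must never raise */
    model_current_irql = irql;
}

typedef int (*model_io_fn)(void *ctx);

/* Runs io() at the raised level and restores the saved level afterwards.
 * There is a single exit path, so a failing call cannot leave the level
 * raised. */
int model_call_at_dispatch(model_io_fn io, void *ctx)
{
    int old_irql;
    int status;

    model_raise_irql(MODEL_DISPATCH_LEVEL, &old_irql);
    status = io(ctx);
    model_lower_irql(old_irql);
    return status;
}

/* Example callee: records the level it ran at and reports failure. */
int model_record_level(void *ctx)
{
    *(int *)ctx = model_current_irql;
    return -1;
}
```

Calling model_call_at_dispatch(model_record_level, &seen) runs the callee at the raised level and returns its status with model_current_irql back at its starting value. In a real driver the equivalent pair would be KeRaiseIrql() and KeLowerIrql().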

Oxalin avatar Feb 07 '18 19:02 Oxalin

FYI: I recompiled with the server-side accepted version number (111) and got the following stack trace from the kernel memory dump after detaching a Bluetooth dongle (BSoD):

nt!KeBugCheckEx
nt!KiBugCheckDispatch + 0x69
nt!KiPageFault + 0x519
USBIPEnum + 0x225e
nt!IoCancelIrp + 0x71
BTHUSB!UsbWrapCancelAllPingPongIrps + 0xe5
BTHUSB!USBStopInterruptTransfers + 0x50
BTHUSB!BthUsb_SetPipeState + 0xa0
BTHUSB!BthUsb_HandleStateChange + 0x6a
BTHUSB!BthUsb_PnpRemove + 0x7c
bthport!BthProcessStateChange + 0x132
bthport!BthProcessRemove + 0x147
bthport!BthHandleSurpriseRemoval + 0xa1
bthport!BthHandlePnp + 0x1a7
bthport!BthDispatchPnp + 0x61
nt!IofCallDriver + 0x59
nt!IopSynchronousCall + 0xe5
nt!IopRemoveDevice + 0xdf
nt!PnpSurpriseRemoveLockedDeviceNode + 0xba
nt!PnpDeleteLockedDeviceNode + 0xaf
nt!PnpDeleteLockedDeviceNodes + 0xb3
nt!PnpProcessQueryRemoveAndEject + 0x44a
nt!PnpProcessTargetDeviceEvent + 0xde
nt!PnpDeviceEventWorker + 0x29b
nt!ExpWorkerThread + 0xf5
nt!PspSystemThreadStartup + 0x47
nt!KiStartSystemThread + 0x16

dennisdegryse avatar Feb 26 '18 18:02 dennisdegryse

Hi @dennisdegryse. Thank you for your trace. I'm pretty sure I've pinpointed where the problem is, but not why it is happening. I'm mostly working on the usbip-tools for now, but I'll take whatever you can feed me on the driver side: which OS are you testing on? Which commit are you compiling? How did you disconnect your device? Did it generate an "IRQL not less or equal" error or a different message?

From what I can read, you triggered a surprise removal: the PnP manager processes a query-remove-and-eject request, then goes on to delete device nodes, reaching a specific node that it handles through PnpSurpriseRemoveLockedDeviceNode. It then calls the Bluetooth driver (bthport), which handles the surprise removal in turn: BTHUSB stops the interrupt transfers and cancels all IRPs through IoCancelIrp, and this is where it fails inside USBIPEnum, generating a page fault.

Oxalin avatar Feb 26 '18 23:02 Oxalin

@dennisdegryse : also, could you attach the dump file?

Oxalin avatar Feb 26 '18 23:02 Oxalin

ATM I only have a full memory dump of 1.2GB, from a non-sandboxed environment (may contain info I don't want to leak). I'll set up a VM for a reproduction and new dump asap. Do you want the full memory dump or will a minidump suffice?

dennisdegryse avatar Feb 27 '18 11:02 dennisdegryse

@dennisdegryse : If you were testing using the master/HEAD, I pushed a fix a few minutes ago (well, I hope this will work).

IoCancelIrp() calls the driver's cancel routine, which is cancel_irp(). At first, I thought we were hitting a wrongly assumed IRQL at DISPATCH_LEVEL. However, after digging through Microsoft's documentation, I think I have properly fixed the code.
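For readers unfamiliar with the cancel routine contract, here is a small user-space model in C. It is not the fix that was pushed (the thread does not show it) and not WDK code: every sim_* name is hypothetical. It models only the documented behaviour that IoCancelIrp() takes the cancel spin lock, saves the previous IRQL in the IRP's CancelIrql field, and calls the cancel routine with the lock still held, so the routine must release the lock with that saved value before completing the IRP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space model only: the sim_* names are hypothetical,
 * not WDK or usbip_windows APIs. */
enum { SIM_PASSIVE_LEVEL = 0, SIM_DISPATCH_LEVEL = 2 };

typedef struct sim_irp sim_irp;
typedef void (*sim_cancel_fn)(sim_irp *irp);

struct sim_irp {
    bool cancel;                  /* models Irp->Cancel */
    int cancel_irql;              /* models Irp->CancelIrql */
    sim_cancel_fn cancel_routine; /* models the registered cancel routine */
    bool completed_cancelled;     /* set when the routine "completes" the IRP */
};

int sim_irql = SIM_PASSIVE_LEVEL;
bool sim_cancel_lock_held = false;

/* Models IoAcquireCancelSpinLock(): take the lock, raise the level,
 * and return the previous level. */
int sim_acquire_cancel_lock(void)
{
    int old = sim_irql;
    assert(!sim_cancel_lock_held);
    sim_cancel_lock_held = true;
    sim_irql = SIM_DISPATCH_LEVEL;
    return old;
}

/* Models IoReleaseCancelSpinLock(): drop the lock, restore the level. */
void sim_release_cancel_lock(int irql)
{
    assert(sim_cancel_lock_held);
    sim_cancel_lock_held = false;
    sim_irql = irql;
}

/* Models IoCancelIrp(): when a cancel routine is registered, it is called
 * with the lock held and becomes responsible for releasing it. */
bool sim_cancel_irp(sim_irp *irp)
{
    int old = sim_acquire_cancel_lock();
    sim_cancel_fn routine = irp->cancel_routine;

    irp->cancel = true;
    irp->cancel_routine = NULL;
    if (routine == NULL) {
        sim_release_cancel_lock(old);
        return false;
    }
    irp->cancel_irql = old;
    routine(irp);
    return true;
}

/* A well-behaved cancel routine: release the lock with the saved level
 * first, then complete the IRP as cancelled. */
void sim_good_cancel_routine(sim_irp *irp)
{
    sim_release_cancel_lock(irp->cancel_irql);
    irp->completed_cancelled = true;
}
```

With an IRP whose cancel_routine is sim_good_cancel_routine, sim_cancel_irp() returns true and leaves both the lock and the level as they were before the call. A routine that forgot the release, or passed a level other than the saved one, would leave sim_irql or sim_cancel_lock_held in the wrong state, which is the kind of mistake suspected in this issue.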

If you want, give it a try and let me know whether this fixes your problem (I still haven't worked on the server-side accepted version number (111)).

Oxalin avatar Feb 28 '18 08:02 Oxalin