usbipd-win icon indicating copy to clipboard operation
usbipd-win copied to clipboard

URB_BULK transfers work only once, then never again until device is replugged.

Open danielk-98 opened this issue 1 year ago • 17 comments

Hello! First off, thanks so much for your work on this project. It's crazy how important this is for Windows and WSL, and it's a shame it's not receiving more direct support from Microsoft.

The problem: I have a microcontroller which successfully attaches to WSL. A program (based on libusb) connects to the device and requests some information. The program successfully transfers the data the first time it is run; running the program a second time fails.

In WSL, "dmesg" shows errors like:

  • vhci_hcd: vhci_device speed not set
  • vhci_hcd: the urb (seqnum 106) was already given back

Host: Windows 11 build 22000 WSL version: 1.2.0.0 Kernel version: 5.15.90.1 WSLg version: 1.0.51 MSRDC version: 1.2.3770 Direct3D version: 1.608.2-61064218 DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp Windows version: 10.0.22000.1817 USBIPD 2.4.1+1.Branch.master.Sha.90acf9456020ca8c6310ca62a71ee23cb6ca34ad

Guest: Ubuntu 20.04 Kernel: 5.15.90.1-microsoft-standard-WSL2

Procedure:

1. USBIPD server is started with logging:

usbipd server Logging:LogLevel:Default=Trace 'usbipd:PcapNg:Path=C:\Users\Daniel\Desktop\usbip.pcapng'

2. Device is attached from Windows:

usbipd wsl attach -d ubuntu20test -i 0694:0002

Server and "dmesg" logs immediately after connection:

dbug: Usbipd.Server[1000]
      new connection from 172.19.237.228
dbug: Usbipd.Server[1000]
      connection close: Attempted to read past the end of the stream.
dbug: Usbipd.Server[1000]
      connection closed
dbug: Usbipd.Server[1000]
      new connection from 172.19.237.228
dbug: Usbipd.ConnectedClient[1000]
      Received opcode: OP_REQ_IMPORT
info: Usbipd.ConnectedClient[1]
      Client 172.19.237.228 claimed device at 6-4 (USB\VID_0694&PID_0002\0016530C69D5).
dbug: Usbipd.AttachedClient[1000]
      USBSUP_XFER_STALL -> EPIPE -> -32
dbug: Usbipd.AttachedClient[1000]
      USBSUP_XFER_STALL -> EPIPE -> -32
dbug: Usbipd.AttachedClient[1000]
      USBSUP_XFER_STALL -> EPIPE -> -32
dbug: Usbipd.AttachedClient[1000]
      Trapped SET_CONFIGURATION: 1
[ 2238.118580] vhci_hcd vhci_hcd.0: pdev(0) rhport(0) sockfd(3)
[ 2238.119114] vhci_hcd vhci_hcd.0: devid(393220) speed(2) speed_str(full-speed)
[ 2238.119497] vhci_hcd vhci_hcd.0: Device attached
[ 2238.391478] vhci_hcd: vhci_device speed not set
[ 2238.461381] usb 1-1: new full-speed USB device number 4 using vhci_hcd
[ 2238.551402] vhci_hcd: vhci_device speed not set
[ 2238.621366] usb 1-1: SetAddress Request (4) to port 0
[ 2238.661024] usb 1-1: New USB device found, idVendor=0694, idProduct=0002, bcdDevice= 0.00
[ 2238.661377] usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=1
[ 2238.661633] usb 1-1: SerialNumber: 0016530C69D5

3. WSL shows the device is available at 1-4:

$ lsusb
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 004: ID 0694:0002 Lego Group Mindstorms NXT
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

4. First transaction with the device is successful:

$ nexttool -deviceinfo
Brick name = NXT4
Bluetooth Address = 00:16:53:0C:69:D5
Bluetooth signal strength = 0,0,0,0
Free memory = 129504

Server and dmesg:

dbug: Usbipd.AttachedClient[1000]
      Trapped SET_CONFIGURATION: 1
[ 2415.941405] vhci_hcd: vhci_device speed not set
[ 2416.011386] usb 1-1: reset full-speed USB device number 4 using vhci_hcd
[ 2416.091365] vhci_hcd: vhci_device speed not set
[ 2416.161394] usb 1-1: SetAddress Request (4) to port 0

5. Second transaction with the device fails:

$ nexttool -deviceinfo

Server and dmesg:

dbug: Usbipd.AttachedClient[1000]
      Trapped SET_CONFIGURATION: 1
trce: Usbipd.AttachedClient[1001]
      Unlinking 105, pending = True, pending count = 1
trce: Usbipd.AttachedClient[1001]
      Aborting endpoint 130
dbug: Usbipd.AttachedClient[1000]
      USBSUP_XFER_STALL -> EPIPE -> -32
[ 2514.461372] vhci_hcd: vhci_device speed not set
[ 2514.531477] usb 1-1: reset full-speed USB device number 4 using vhci_hcd
[ 2514.611432] vhci_hcd: vhci_device speed not set
[ 2514.681471] usb 1-1: SetAddress Request (4) to port 0
[ 2515.727820] vhci_hcd: unlink->seqnum 106
[ 2515.728077] vhci_hcd: the urb (seqnum 106) was already given back

6. Device is detached from Windows:

usbipd wsl detach -i 0694:0002

Server and dmesg:

dbug: Usbipd.ConnectedClient[1000]
      Unbind or unplug while attached
info: Usbipd.ConnectedClient[2]
      Client 172.19.237.228 released device at 6-4 (USB\VID_0694&PID_0002\0016530C69D5).
dbug: Usbipd.Server[1000]
      connection close: The operation was canceled.
dbug: Usbipd.Server[1000]
      connection closed
[ 2676.358117] vhci_hcd: connection closed
[ 2676.358229] vhci_hcd: stop threads
[ 2676.358896] vhci_hcd: release socket
[ 2676.359092] vhci_hcd: disconnect device
[ 2676.359385] usb 1-1: USB disconnect, device number 4

Things I've Tried

  • Booting WSL2 with and without SystemD init.
  • Modifying UDEV rules.
  • Running "nexttool" with and without "sudo".
  • Running WSL2 from an Administrator terminal.
  • Restarting UDEV service

I have no problems on VirtualBox with Ubuntu 20.04, so it seems likely to be an issue with USBIPD or with the WSL kernel.

Could this be similar to 530? https://github.com/dorssel/usbipd-win/issues/530

I've attached the full server and dmesg logs, along with the PcapNg for this session.

Thank you for the help! dmesg.txt serverlog.txt pcapng.zip

danielk-98 avatar Apr 14 '23 01:04 danielk-98

The only weird thing that I can see in this log is:

trce: Usbipd.AttachedClient[1001]
      Unlinking 105, pending = True, pending count = 1
trce: Usbipd.AttachedClient[1001]
      Aborting endpoint 130
dbug: Usbipd.AttachedClient[1000]
      USBSUP_XFER_STALL -> EPIPE -> -32

This is indeed the last URB that is logged in the PCap as well. The Linux log on the other hand shows:

[ 2515.727820] vhci_hcd: unlink->seqnum 106
[ 2515.728077] vhci_hcd: the urb (seqnum 106) was already given back

Which is weird, because the last URB is 105; there never was a URB 106. What driver is being used on the Linux side?

dorssel avatar Apr 15 '23 14:04 dorssel

Thanks for the help! It looks like no specific driver is used for the NXT; It lists as a "Generic USB device". So nexttool (and the other, similar programs) probably just use libusb to access the device directly, since most interactions are so simple.

Running sudo lspci -v shows:

  *-usbhost:0
       product: USB/IP Virtual Host Controller
       vendor: Linux 5.15.90.1-microsoft-standard-WSL2 vhci_hcd
       physical id: 1
       bus info: usb@1
       logical name: usb1
       version: 5.15
       capabilities: usb-2.00
       configuration: driver=hub slots=8 speed=480Mbit/s
     *-usb UNCLAIMED
          description: Generic USB device
          product: Mindstorms NXT
          vendor: Lego Group
          physical id: 1
          bus info: usb@1:1
          version: 0.00
          serial: 0016530C69D5
          capabilities: usb-2.00
          configuration: speed=12Mbit/s

Anything else I could try to collect more information?

danielk-98 avatar Apr 16 '23 07:04 danielk-98

I installed WireShark in a Virtualbox VM (Ubuntu 20.04) to see what a successful transaction looks like. Then I reran the test in WSL2 for comparison.

  1. Connect the NXT device to the VM.
  2. Run nexttool -deviceinfo
  3. Run nexttool -deviceinfo again (WSL fails at this point)
  4. Disconnect the device and save the USB logs

The two logs show almost identical URBs during this test. Only differences I can see are:

  • WSL always sends two GET DESCRIPTOR Request DEVICE but Virtualbox only sends one
  • WSL's last URB 105, URB_BULK in, fails with URB status: Broken pipe (-EPIPE) (-32) (which corresponds to the weird server logs you pointed out) and looses the 33-byte payload that is supposed to be there.

Not sure if this adds anything or not. Very confusing why the transaction only fails on the second run!

Here's the vbox pcap: virtualbox.zip

danielk-98 avatar Apr 16 '23 10:04 danielk-98

WSL always sends two GET DESCRIPTOR Request DEVICE but Virtualbox only sends one

This may be due to caching (which usbipd does not do). First, the OS queries "what device is this?". Based on the answer, a certain driver is asked to handle the device. That driver probably can serve multiple devices, so it too first asks "what device is this?" Maybe VirtualBox caches the result of the static reply (the device descriptor never changes), and does not actually ask the device again but simply returns the same result for the second query. Since the second query is never actually sent to the device, it does not get captured. In any case, this difference is benign.

WSL's last URB 105, URB_BULK in, fails with URB status: Broken pipe (-EPIPE) (-32)

This is (of course) the problem. All I can think of is that during "SET CONFIGURATION", maybe more needs to be done than just that. Some sort of pipe reset, or so. Just a wild guess ... I will look into it.

dorssel avatar Apr 18 '23 20:04 dorssel

Hi,

I have encounter similar problem and I followed the WSL installation instructions here: https://www.xda-developers.com/wsl-connect-usb-devices-windows-11/

There's a step that install the user space tools for USB/IP and a database of USB hardware identifiers... I'm not sure if that would related to the issues.

After all installation success and I connected my astronomical camera. The linux has a server running and the device was functioning probably (e.g. the driver can send commands to the device and responding), until the camera capture started and transfer the data back. At the data trasnfer via the USB, error shows up in dmesg.

From the error descriptions and the pcapng capture, the symptoms are very similar. The WLS URB encounter broken pipe with a URB# in the linux sub system but the same URB# did not appears in the WSL.

The dmesg is showing seqnum 527, 529, 531, etc but the WSL showing 528, 530, 532....

[ 110.868153] vhci_hcd: unlink->seqnum 527 [ 110.868155] vhci_hcd: the urb (seqnum 527) was already given back [ 111.419232] vhci_hcd: unlink->seqnum 529 [ 111.419250] vhci_hcd: the urb (seqnum 529) was already given back [ 111.970243] vhci_hcd: unlink->seqnum 531 [ 111.970246] vhci_hcd: the urb (seqnum 531) was already given back [ 112.521337] vhci_hcd: unlink->seqnum 533 [ 112.521340] vhci_hcd: the urb (seqnum 533) was already given back

dbug: Usbipd.AttachedClient[1000] USBSUP_XFER_STALL -> EPIPE -> -32 trce: Usbipd.AttachedClient[1001] Unlinking 528, pending = True, pending count = 1 trce: Usbipd.AttachedClient[1001] Aborting endpoint 4 dbug: Usbipd.AttachedClient[1000] USBSUP_XFER_STALL -> EPIPE -> -32 trce: Usbipd.AttachedClient[1001] Unlinking 530, pending = True, pending count = 1 trce: Usbipd.AttachedClient[1001] Aborting endpoint 4 dbug: Usbipd.AttachedClient[1000] USBSUP_XFER_STALL -> EPIPE -> -32 trce: Usbipd.AttachedClient[1001] Unlinking 532, pending = True, pending count = 1 trce: Usbipd.AttachedClient[1001] Aborting endpoint 4 dbug: Usbipd.AttachedClient[1000] USBSUP_XFER_STALL -> EPIPE -> -32 trce: Usbipd.AttachedClient[1001] Unlinking 534, pending = True, pending count = 1 trce: Usbipd.AttachedClient[1001] Aborting endpoint 4

Attached is the pcapng capture log, the linux dmesg and the wsl debug output

ubuntu_dmesg.txt wsl_log.txt usbipd.zip

sywong2000 avatar Aug 20 '23 07:08 sywong2000

This may have been fixed with #684. Could you give that build a try? The installer is at https://github.com/dorssel/usbipd-win/actions/runs/6061183362.

dorssel avatar Sep 03 '23 11:09 dorssel

Thank you. Unfortunately it's still not working for me if I'm correct the status -104 is ECONNRESET which is different from before I will try and capture more logs later today

Thanks again

[ 397.651145] vhci_hcd: unlink->seqnum 1893 [ 397.651148] vhci_hcd: urb->status -104 [ 398.202404] vhci_hcd: unlink->seqnum 1895 [ 398.202406] vhci_hcd: urb->status -104 [ 398.753862] vhci_hcd: unlink->seqnum 1897 [ 398.753864] vhci_hcd: urb->status -104 [ 399.305095] vhci_hcd: unlink->seqnum 1899 [ 399.305097] vhci_hcd: urb->status -104 [ 399.856204] vhci_hcd: unlink->seqnum 1901 [ 399.856207] vhci_hcd: urb->status -104 [ 400.407446] vhci_hcd: unlink->seqnum 1903

sywong2000 avatar Sep 05 '23 13:09 sywong2000

I have difficulties to get into more information after having the 3.1.1 msi installed. First WSL USB GUI reported some installation dependencies were missing and asking to re-install. After reinstall it keeps saying the same. Restart Windows and it seemed to work but then my device originally named "Atik Camera" is replaced as "USBIP Shared Device" (I can spot they in the same DEV ID). I tried using usbipd command to attached manually - attachment was successful but the device is not showing in the WSL guest Ubuntu. I will try to remove everything and try running via Visual Studio and see if I can get more information..... The only successful attempt I tried is that in VirtualBox Guest Linux, the device & the transfer works fine - so I can say the device do not have a problem working under usb/ip protocol... Yet I looked into the ECONNRESET status - it's a successful state for the urb unlink. So the problem may be caused by a different issue

sywong2000 avatar Sep 07 '23 05:09 sywong2000

WSL USB GUI reported some installation dependencies

3.1.1 is a dev version. I am not surprised it does not work with the external GUI tool. But you may want to report it to that project as well.

attachment was successful but the device is not showing in the WSL guest Ubuntu

That is weird.. What is the output of lsusb and do you have a syslog?

Yet I looked into the ECONNRESET status - it's a successful state for the urb unlink. So the problem may be caused by a different issue

Yes, the issue seems not related.

dorssel avatar Sep 07 '23 09:09 dorssel

I loaded the master branch into Visual Studio and have it started and running in the shell. I still see my device showing as "USBIP Shared Device" but there're no problems on attaching it now and showing OK in the WSL guest.

I started the camera capturing and found that although the dmesg print is not the same before but the problem symtom seemed to be the same, after checking the pcapng capture against the dmesg:

pcapng shows URB packets with URB ID (0a43=2627, 0a45=2629, etc) being submitted and completed but on the other hand dmesg shows none of those URB#. The dmesg showing unlink with URB ID 2626, 2628, 2630 but these numbers are not appearing in the URB trace pcapng

meanwhile the VS trace output shows USBSUP_XFER_STALL -> EPIPE -> -32

These all happens when the camera finished its capturing and trying to download the data from the camera. The camera driver showing BulkRead Failed. I checked at the exact time of the download starts, the URB packets started to slip... at the attached pcapng capstures, line#1796 the URB are good in sequence 09fc, 09fd, but after this point 09fe is missing and jumps to 09ff and 0a01 (0a00 is missing) and so on.... meanwhile these missing URB# are showing in the kernel unlink

I can try and trace and see what caused the USBSUP_XFER_STALL but I have very little experiences on the USB/IP. I will have to study and see how I can trap the packets and find out.

[ 1757.350935] vhci_hcd: unlink->seqnum 2624 [ 1757.350937] vhci_hcd: urb->status -104 [ 1757.902103] vhci_hcd: unlink->seqnum 2626 [ 1757.902106] vhci_hcd: urb->status -104 [ 1758.453133] vhci_hcd: unlink->seqnum 2628 [ 1758.453136] vhci_hcd: urb->status -104 [ 1759.004167] vhci_hcd: unlink->seqnum 2630

dbug: 7-3.4[1000] USBSUP_XFER_STALL -> EPIPE -> -32 trce: 7-3.4[1001] Unlinking: PendingSubmits=1, PendingUnlinks=1 trce: 7-3.4[1001] Aborting endpoint dbug: 7-3.4[1000] USBSUP_XFER_STALL -> EPIPE -> -32 trce: 7-3.4[1001] Unlinking: PendingSubmits=1, PendingUnlinks=1 trce: 7-3.4[1001] Aborting endpoint dbug: 7-3.4[1000] USBSUP_XFER_STALL -> EPIPE -> -32 trce: 7-3.4[1001] Unlinking: PendingSubmits=1, PendingUnlinks=1 trce: 7-3.4[1001] Aborting endpoint dbug: 7-3.4[1000] USBSUP_XFER_STALL -> EPIPE -> -32 trce: 7-3.4[1001] Unlinking: PendingSubmits=1, PendingUnlinks=1 trce: 7-3.4[1001] Aborting endpoint dbug: 7-3.4[1000] USBSUP_XFER_STALL -> EPIPE -> -32 trce: 7-3.4[1001]

usbipd.zip

sywong2000 avatar Sep 07 '23 14:09 sywong2000

The messages you see are normal for cancelation of a transfer. It looks like the client (WSL) is requesting data, which is then canceled again by the client. This leads to UNLINK of the outstanding read URB.

UNLINK , Aborting endpoint, USBSUP_XFER_STALL -> EPIPE -> -32 , and status -104 are all normal messages that belong to the cancelation by the client of an earlier read (or write) request. They are not errors.

dorssel avatar Sep 07 '23 14:09 dorssel

OK. Thank you! How about the URB ID jumps on URB_BULK out? I assume the URB# should be in sequence under normal operations, correct? The jumping of the URB# may be an indication on missing chunks from the host to the device?

sywong2000 avatar Sep 07 '23 15:09 sywong2000

UNLINKs are not logged (they are technically not URBs, but instead the cancelation of an earlier URB). The "URB#" that is logged is really the USBIP request number. So:

0x9fd is an out request, i.e. an URB, so it is logged
0x9fe is the UNLINK (cancelation) request for 0x9fd (not logged in PCap, as it does not support UNLINK requests, there is no format for it).
0x9ff is another out request
0xa00 is another UNLINK, for 0x9ff
0xa01 is another out request

etc.

What is interesting is that at the start of the capture, apparently the status is read back (endpoint 6) alongside the output (endpoint 4). But from the failures on, only failing outputs, no more input requests. Also: the EPIPE completion after cancelation is always 0.5 seconds after the out request. So it appears to be a timeout issue: an out request is sent, which after 0.5 seconds gets canceled before it completes. Then (of course) it acknowledges cancelation with EPIPE (that's just how it works).

Questions are:

  • why do the status read backs stop at the same point where outputs start failing
  • why does the device timeout. The earlier (successful) requests complete within 1 millisecond, And then, suddenly, they no longer complete within 500 ms.

dorssel avatar Sep 07 '23 22:09 dorssel

OK I see where the chunks started the 0.5s cancel and I do more captures including what's happening in the Windows host. I can see the URB_FUNCTION_ABORT_PIPE (0x0002) at the time the problem happens at the host computer. I do not have the source code of the driver so it's a little bit difficult to tell why it timed out.

I have also captured the camera operating under a guest virtualbox which the image download is successful, but only the USB traffic via usbpcap in Windows host. Also I capture the USB when I operate the camera directly in the Windows machine. I do see bigger chunks (>16Kb) frames for those successful attempts. So it seemed the transfer never started in the WSL...

usb_captures.zip

sywong2000 avatar Sep 08 '23 05:09 sywong2000

From the localonly capture, focusing on the bulk endpoint (endpoint 8):

1768	18.591696	2.17.8	host	USB	34	URB_BULK in
2451	22.709311	host	2.17.8	USB	27	URB_BULK in
2452	22.709394	2.17.8	host	USB	34	URB_BULK in
2505	22.924759	host	2.17.8	USB	27	URB_BULK in
2506	22.924848	2.17.8	host	USB	34	URB_BULK in
2647	24.433789	host	2.17.8	USB	27	URB_BULK in
2810	25.978284	2.17.8	host	USB	2097179	URB_BULK in
2811	25.978310	host	2.17.8	USB	27	URB_BULK in
2864	27.168742	2.17.8	host	USB	2097179	URB_BULK in
2865	27.168773	host	2.17.8	USB	27	URB_BULK in
2920	28.359342	2.17.8	host	USB	2097179	URB_BULK in
2921	28.359372	host	2.17.8	USB	27	URB_BULK in

Around the same "logical" point, using usbipd-win:

4216	34.938190	2.16.8	host	USB	34	URB_BULK in
4321	35.940501	host	2.16.8	USB	27	URB_BULK in
4322	35.940554	2.16.8	host	USB	34	URB_BULK in
4363	36.143332	host	2.16.8	USB	27	URB_BULK in
4364	36.143358	2.16.8	host	USB	34	URB_BULK in
4435	37.232748	host	2.16.8	USB	27	URB_FUNCTION_ABORT_PIPE
4436	37.232750	2.16.8	host	USB	27	URB_FUNCTION_ABORT_PIPE

It looks like the device wants to transfer 2097179 bytes (about 2 MB) per request. That is quite large (in fact, the largest I've ever seen, per transfer). Either this times out due to latency, or that's beyond some limit (unknown to me). In any case, the transfer does not complete within 0.5 seconds on usbipd-win, after which the client (i.e., WSL) cancels the request. On the other hand, the transfers take between 1 and 1.5 seconds in the localonly case. So, it looks like "something" (don't know what, Linux USBIP driver?) enforces a timeout of 0.5 seconds, but the transfer really requires >1 seconds to complete.

dorssel avatar Sep 08 '23 12:09 dorssel

Thanks for your analysis on the traffic. I certainly think that the Windows driver with its own implementation that support much larger transfer block size on the USB device if accessed locally.

About the timeout scenario, I was thinking for such case, I would probably see a larger block of transfer at the Windows host but do not see them in the guest. That's also why I captured using wireshark + UsbPcap to log the traffic in the host machine: usb_cap_usbipd_wsl2_host.pcapng.

However upon checking the two logs: the largest block I can find in the host capture was only 195 bytes and it was not even closed to the time when the transfer issue happens (the problem happened around 12:17:10 GMT+8). At the time of the problem happens, there were only chunks in sizes of 43, 27 bytes being transferred. So it seemed like the transfer may have never started (?).

** just an update if I'm not mistaken by filtering the endpoints to compare the traffic, there is one 27-byte URB request in Virtualbox VM traffic (#1627) would trigger the subsequence response for the 16Kb data coming back from the device. Comparing to the WSL guest captures, it seems that the usbipd never receives such URB request and time out after 30s? Am I interpreting this correctly?

USB Host traffic started with WSL2

No.	Time	Source	Destination	Protocol	Length	Info
10489	88.790947	host	2.16.8	USB	27	URB_BULK in
10490	88.790997	2.16.8	host	USB	34	URB_BULK in
10499	89.793834	host	2.16.8	USB	27	URB_BULK in
10500	89.793873	2.16.8	host	USB	34	URB_BULK in
10509	89.996358	host	2.16.8	USB	27	URB_BULK in
10510	89.996389	2.16.8	host	USB	34	URB_BULK in
10533	90.998272	host	2.16.8	USB	27	URB_BULK in
10534	90.998306	2.16.8	host	USB	34	URB_BULK in
10543	91.201730	host	2.16.8	USB	27	URB_BULK in
10544	91.201777	2.16.8	host	USB	34	URB_BULK in

<-- missing URB here and then after 30 secs it timed out?

12687	121.952288	host	2.16.8	USB	27	URB_FUNCTION_ABORT_PIPE
12688	121.952289	2.16.8	host	USB	27	URB_FUNCTION_ABORT_PIPE

Comparing to the chunks captured in Virtualbox VM transfer (same version of ubuntu guest, and I only capture the Windows Host traffic but not the Linux guest) - I spotted there're transfers of many sets of 16Kb blocks (to be exact 16411 bytes, that's 16384 + 27 bytes) - which are likely the URB that carries the 16Kb chunk image data with the URB headers and completed the transfer and each of them takes ~10ms. Not 100% sure about the mechanism of virtualbox USB passthrough - I just assume it's also an implementation of USB/IP.

USB Host traffic when running in VirtualBox VM

No.	Time	Source	Destination	Protocol	Length	Info
622	1.672115	2.18.8	host	USB	34	URB_BULK in
943	2.689192	host	2.18.8	USB	27	URB_BULK in
944	2.689259	2.18.8	host	USB	34	URB_BULK in
1029	2.910157	host	2.18.8	USB	27	URB_BULK in
1030	2.910185	2.18.8	host	USB	34	URB_BULK in
1627	5.309898	host	2.18.8	USB	27	URB_BULK in  <--
1708	5.709428	2.18.8	host	USB	16411	URB_BULK in  <--
1711	5.711395	host	2.18.8	USB	27	URB_BULK in
1712	5.719471	2.18.8	host	USB	16411	URB_BULK in
1713	5.719552	host	2.18.8	USB	27	URB_BULK in
1716	5.728699	2.18.8	host	USB	16411	URB_BULK in
1717	5.728893	host	2.18.8	USB	27	URB_BULK in
1720	5.738034	2.18.8	host	USB	16411	URB_BULK in
1721	5.739345	host	2.18.8	USB	27	URB_BULK in
1724	5.747929	2.18.8	host	USB	16411	URB_BULK in
1725	5.748022	host	2.18.8	USB	27	URB_BULK in
1728	5.757292	2.18.8	host	USB	16411	URB_BULK in
1729	5.757415	host	2.18.8	USB	27	URB_BULK in
1732	5.766492	2.18.8	host	USB	16411	URB_BULK in

sywong2000 avatar Sep 12 '23 08:09 sywong2000

I did the capture once again today with release 3.2.0 and there're some interesting findings. The overall process still failed. And I was able to locate the full image data chunks, but they only comes back, after the device has been re-initialized at a later time again and being configured: usbipd.zip

dmesg_20230912_01.txt is the dmesg output indi_app_log.txt is the application log usbipd_01.pcapng is the PcapNg capture

  • 2023-09-12 20:37:14 Driver start to capture an image with 10sec exposure. So expect at 20:37:24 the image will be ready for download.

  • 2023-09-12 20:37:25 Driver started reporting "ParDeviceLibUSB::In - BulkRead Failed!". The image data didn't come in as expected. At the same time dmesg started showing the unlinks: [Tue Sep 12 20:37:25 2023] vhci_hcd: unlink->seqnum 3842 [Tue Sep 12 20:37:25 2023] vhci_hcd: urb->status -104 [Tue Sep 12 20:37:26 2023] vhci_hcd: unlink->seqnum 3844 [Tue Sep 12 20:37:26 2023] vhci_hcd: urb->status -104

  • At this point, the PcapNg didn't really show anything for endpoint 8. I saw the time out occurs in endpoint 4.

  • 2023-09-12 20:39:40 I manually abort the capture. The device takes a long time to stop as it seemed still running.

  • 2023-09-12 20:39:58 Software stopped. Process terminated.

  • 2023-09-12 20:40:14 Then I started a new instance of the server.

  • 2023-09-12 20:40:21 Camera being initialized by the new process again and being setup, then:

  • At this particular time, there're a set of 19 x URB_BULK IN each size 1000065 bytes (around 976 Kb) , total around ~18Mb showing in the PcapNg. The 18Mb is also the usual image full file size. So it looks like these URB_BULK IN chunks were "delayed" until the device was being initialized the second time. At the same time, the drivers log showing a lots of "ParDeviceLibUSB::In - OK!! " which indicates the driver were able to "receive" these chunks but unable to process them because they're not associated with any capture request. I cannot be sure but seemed the initialization happened at the 2nd time release/flush those previous URB chunks queued somewhere maybe?

No.	Time	Source	Destination	Protocol	Length	Info
1682	2023-09-12 20:40:21.787241	6.4.8	host	USB	67	URB_BULK in
1687	2023-09-12 20:40:21.838999	host	6.4.8	USB	64	URB_BULK in
1688	2023-09-12 20:40:22.407024	6.4.8	host	USB	1000065	URB_BULK in
1691	2023-09-12 20:40:22.415025	host	6.4.8	USB	64	URB_BULK in
1692	2023-09-12 20:40:22.982857	6.4.8	host	USB	1000065	URB_BULK in
1695	2023-09-12 20:40:22.987156	host	6.4.8	USB	64	URB_BULK in
1696	2023-09-12 20:40:23.554518	6.4.8	host	USB	1000065	URB_BULK in
1699	2023-09-12 20:40:23.557446	host	6.4.8	USB	64	URB_BULK in
1700	2023-09-12 20:40:24.125093	6.4.8	host	USB	1000065	URB_BULK in
1703	2023-09-12 20:40:24.128214	host	6.4.8	USB	64	URB_BULK in
1704	2023-09-12 20:40:24.695626	6.4.8	host	USB	1000065	URB_BULK in
1707	2023-09-12 20:40:24.698106	host	6.4.8	USB	64	URB_BULK in
1708	2023-09-12 20:40:25.265646	6.4.8	host	USB	1000065	URB_BULK in
1711	2023-09-12 20:40:25.269358	host	6.4.8	USB	64	URB_BULK in
1712	2023-09-12 20:40:25.836214	6.4.8	host	USB	1000065	URB_BULK in
1715	2023-09-12 20:40:25.838712	host	6.4.8	USB	64	URB_BULK in
1716	2023-09-12 20:40:26.407745	6.4.8	host	USB	1000065	URB_BULK in

sywong2000 avatar Sep 12 '23 14:09 sywong2000