Raw gadget backend
This implements #15 by updating the Raw Gadget backend from @xairy's v2 fork at https://github.com/xairy/Facedancer/tree/rawgadget to API v3.
It works with the serial and rubber ducky examples, at least when using the patched driver from https://github.com/xairy/raw-gadget/tree/dev, so I'm not sure whether you want to merge it before those changes get upstreamed.
I also haven't gotten it to work with usbproxy and my mouse yet, so there is still more work to do.
Awesome, thank you for working on this!
I think we indeed need to figure out what to do with those out-of-tree Raw Gadget changes first.
In Raw Gadget, the syscalls for receiving (and sending) control/non-control transfers block until the transfer is sent/received. This is a problem for Facedancer, as it does not expect its service_irqs call to block (control transfers are handled in service_irqs for Raw Gadget backend). And I believe Facedancer also doesn't want the non-control read transfers to block (is this true?)
There are two options that I see to resolve this:

1. Change the Facedancer code to make it work with the blocking backend calls, or figure out a way to deal with the blocking on the backend side (spawn threads for handling transfers?). If this is possible, I would prefer this approach.
2. If 1 is not possible, the best solution would be to add proper non-blocking I/O support to Raw Gadget. The current out-of-tree patches are more of a hack that makes the sending/receiving syscalls bail out on timeouts.
I think we should first explore if 1 is possible. If not, we can think about the easiest approach to implement 2.
I agree that 1 sounds best, since it can be used right away with current kernels. I don't know enough about the Facedancer design to know if it's reasonable to make a backend that only supports blocking calls.
But it would be pretty clean to create a single daemon thread that handles all blocking calls when blocking=False. As I understand it, they generally should only block for a few ms, and it seems like Facedancer doesn't use the results, so there isn't much to coordinate.
I think we'll need a separate daemon thread for `USB_RAW_IOCTL_EVENT_FETCH`/`service_irqs` and then separate threads for the non-control endpoints, possibly only for the IN ones — these are the ones that can block for a long time.
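The thread layout described above might look something like this minimal sketch. All names here (`run_blocking_loop`, `fetch`, `handle`) are hypothetical stand-ins, and queues simulate the blocking Raw Gadget ioctls — this is not the real backend API:

```python
# Sketch: one generic worker that repeatedly issues a blocking call and
# dispatches the result. The backend would start one such daemon thread
# for event fetching plus one per non-control IN endpoint.
import queue
import threading

def run_blocking_loop(fetch, handle, stop):
    """Repeatedly issue a blocking call and dispatch its result."""
    while not stop.is_set():
        item = fetch()          # blocks like EVENT_FETCH / an ep read ioctl
        if item is None:        # sentinel: backend is shutting down
            break
        handle(item)

# Simulated blocking source for demonstration.
events = queue.Queue()
handled = []
stop = threading.Event()

worker = threading.Thread(
    target=run_blocking_loop,
    args=(events.get, handled.append, stop),
    daemon=True,                # daemon threads die with the main process
)
worker.start()

events.put("reset")
events.put("ep0-setup")
events.put(None)                # tell the worker to exit
worker.join(timeout=2)
print(handled)                  # -> ['reset', 'ep0-setup']
```

Marking the threads as daemons means a Ctrl-C in the main thread still exits the process even while a worker is parked inside a blocking call.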
I added a section to the Raw Gadget readme about the API usage, and about the blocking of ioctls in particular. Hopefully this will be helpful.
It would be great if someone from the Facedancer side could take a look at the Raw Gadget API (@antoinevg) and advise on what would be the best way to integrate the two together.
(Btw, proxying the mouse likely fails due to this missing from the backend implementation. We can add it later, I think we should focus on the blocking issue first.)
@xairy I pushed some changes to make it work with the in-kernel raw_gadget module. I also fixed a bunch of issues I encountered trying to make some basic proxying work, though I can imagine the FD devs would prefer those be in a separate PR.
I ended up using 1-2 threads per endpoint to avoid any manual polling. I'm thinking it can probably be simplified a bit. Right now I have separate control read/write threads, but can I assume control always goes EVENT → ACK or REPLY, repeat? Maybe with a small timeout if no reply is given.
And would it be safe to assume ep_write() doesn't block for a long time?
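The EVENT → ACK-or-REPLY assumption in the question above could be sketched as a loop like this. Everything here is illustrative: the queues stand in for the blocking Raw Gadget calls, and the names do not correspond to real Facedancer or Raw Gadget identifiers:

```python
# Sketch of the assumed control-transfer loop: fetch one control EVENT,
# dispatch it to the emulated device, then wait (with a small timeout)
# for either an ACK or a REPLY before fetching the next event.
import queue

def control_loop(events, replies, dispatch, max_events, reply_timeout=0.05):
    handled = []
    for _ in range(max_events):
        ev = events.get()               # blocks like EVENT_FETCH
        dispatch(ev)
        try:
            # The device answers with "ack" (status only) or "reply"
            # (data + status); give up after a small timeout.
            outcome = replies.get(timeout=reply_timeout)
        except queue.Empty:
            outcome = "stall"           # no answer in time
        handled.append((ev, outcome))
    return handled

events, replies = queue.Queue(), queue.Queue()
events.put("GET_DESCRIPTOR");   replies.put("reply")
events.put("SET_CONFIGURATION"); replies.put("ack")
events.put("VENDOR_REQUEST")    # the device never answers this one
log = control_loop(events, replies, dispatch=lambda ev: None, max_events=3)
print(log)
```

If control really is strictly request/response like this, a single control thread would suffice instead of separate read/write threads.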
Okay, I've simplified it a bit and done some more debugging, so for now from my side it's ready for review.
> @xairy I pushed some changes to make it work with the in-kernel raw_gadget module. I also fixed a bunch of issues I encountered trying to make some basic proxying work,
Amazing!
> though I can imagine the FD devs would prefer those be in a separate PR.
Yes, please. At least split the PR into separate commits for the Raw Gadget backend and for the proxy changes. And probably separate commits for various clean-ups of the other code.
> I ended up using 1-2 threads per endpoint to avoid any manual polling. I'm thinking it can probably be simplified a bit. Right now I have separate control read/write threads, but can I assume control always goes EVENT → ACK or REPLY, repeat? Maybe with a small timeout if no reply is given.
At a high level, the design looks reasonable. I'll dig into the code a bit later.
Btw, from my experience, proxying real devices is a great way to test the emulation/backend implementation. So if we run into any issues with a real device, we'll know if we have anything to fix.
> And would it be safe to assume ep_write() doesn't block for a long time?
As long as the host keeps polling data, it should not block. So I think it's a reasonable assumption.
I've tried running the ftdi-echo.py example on a Raspberry Pi 4 B, and it works perfectly!
But when running the rubber-ducky.py example, I'm getting the following error:
```
$ ./examples/rubber-ducky.py
INFO | __init__ | Starting emulation, press 'Control-C' to disconnect and exit.
INFO | hydradancer | this is hydradancer hi
INFO | rubber-ducky | Beginning message typing demo...
INFO | rawgadget | gadget reset
INFO | device | Host issued a bus reset; resetting our connection.
INFO | rawgadget | ignoring reset request
INFO | rawgadget | gadget resumed
INFO | rawgadget | gadget reset
INFO | device | Host issued a bus reset; resetting our connection.
INFO | rawgadget | ignoring reset request
INFO | rawgadget | gadget resumed
INFO | rawgadget | applying configuration
INFO | rawgadget | ep_enable: handle=12
Exception in thread ep-3-recv:
Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1038, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 975, in run
    self._target(*self._args, **self._kwargs)
  File "/home/pi/.local/lib/python3.11/site-packages/facedancer/backends/rawgadget.py", line 813, in _receiver
    self.backend.connected_device.handle_data_requested(self.ep, timeout=1000)
TypeError: USBKeyboardDevice.handle_data_requested() got an unexpected keyword argument 'timeout'
```
Before I dig into the code, which parts are currently working? To what extent does proxying work? Which UDC did you use?
> `TypeError: USBKeyboardDevice.handle_data_requested() got an unexpected keyword argument 'timeout'`
I overlooked that, but I'll have to think about it. Currently the call to request data from the device has a timeout based on the endpoint update period, which can be as low as 1ms. Since it should block and often returns nothing, that's not ideal. So I added a timeout parameter.
An alternative would be to edit the endpoint interval before calling `handle_data_requested`, or to just live with the 1 ms timeout.
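A minimal sketch of what the optional `timeout` parameter could look like, using hypothetical `Device`/`Endpoint` classes: the point is that callers that don't pass a timeout fall back to the endpoint interval, so the existing `service_irqs`-style call sites keep working. The real Facedancer signatures may differ:

```python
# Sketch, not the real Facedancer API: an optional timeout that defaults
# to the endpoint's polling interval when the caller doesn't supply one.
class Endpoint:
    def __init__(self, number, interval):
        self.number = number
        self.interval = interval        # ms, from the endpoint descriptor

class Device:
    def handle_data_requested(self, endpoint, timeout=None):
        # Fall back to the endpoint interval, preserving old behaviour
        # for callers that never pass a timeout.
        effective = timeout if timeout is not None else endpoint.interval
        return self._poll_for_data(endpoint, effective)

    def _poll_for_data(self, endpoint, timeout):
        return (endpoint.number, timeout)   # stub for illustration

ep = Endpoint(number=3, interval=1)
dev = Device()
print(dev.handle_data_requested(ep))                # -> (3, 1)
print(dev.handle_data_requested(ep, timeout=1000))  # -> (3, 1000)
```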
> Before I dig into the code, which parts are currently working? To what extent does proxying work? Which UDC did you use?
It's working in all my tests. My target is my mouse, and I also did some testing with a thumb drive. I only use dummy_hcd.
I overlooked that, but I'll have to think about it. Currently the call to request data from the device has a timeout based on the endpoint update period, which can be as low as 1ms. Since it should block and often returns nothing, that's not ideal. So I added a timeout parameter.
Ah, I see. But then other backends seem to call `handle_data_requested` (`handle_nak` actually) in `service_irqs`, and that doesn't block...
> It's working in all my tests. My target is my mouse, and I also did some testing with a thumb drive. I only use dummy_hcd.
Tested a mouse and a thumb drive with a Raspberry Pi: proxying also works!
I think we'll still need to better handle altsetting changes for some of the trickier devices, but this is already awesome!
> Ah, I see. But then other backends seem to call `handle_data_requested` (`handle_nak` actually) in `service_irqs` and that doesn't block...
`handle_data_requested` does block, since we are waiting on data from the device. For interrupt endpoints that's 1 ms for my mouse, and I suppose it can be around 10 ms for some devices. If we use that as the timeout we'll need to poll very frequently, which doesn't seem like a good design to me.
For bulk endpoints the timeout defaults to 1 second, which will block the main thread if we do it there. But maybe for bulk endpoints the NAK handler is only called when data is known to be available?
This comment might be relevant:
```python
# TODO: Currently, we use this for _all_ non-control transfers, as we
# don't e.g. periodically schedule isochronous or interrupt transfers.
# We probably should set up those to be independently scheduled and
# then limit this to only bulk endpoints.
self._proxy_in_transfer(endpoint)
```
I've separated the proxy fixes into #162, though it'll be harder to test now.
The code is starting to look good from the Raw Gadget point of view (there are a few more clean-ups to be done, but I'll just do them later myself to avoid nitpicking).
@antoinevg could you PTAL and see if the Facedancer API usage in the backend implementation makes sense?
The main concern I have is wrt calling `handle_data_requested` in a thread. Is it intended to be called in `service_irqs` even though it might block when proxying?
Okay, I pushed some more fixes. I tried running the unit tests and they mostly worked. The remaining failures might not be real, since they seem to be due to bulk read calls returning less data than expected, which is just a timing thing.
I found that `ep_write` often didn't write all of the data, so I wrapped it in a loop.
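The partial-write loop might look roughly like this. The `ep_write` here is a stub that deliberately simulates short writes; the real backend call wraps the Raw Gadget ioctl and has a different signature:

```python
# Sketch: keep calling the write primitive until the whole buffer is out.
# This stub ep_write writes at most 4 bytes per call to simulate the
# short writes observed with the real endpoint.
def ep_write(data):
    written = min(len(data), 4)   # simulate a short write
    sink.extend(data[:written])
    return written

def ep_write_all(data):
    view = memoryview(data)       # avoids copying on each retry
    while view:
        n = ep_write(view)
        if n <= 0:
            raise IOError("endpoint write failed")
        view = view[n:]           # drop the bytes already written

sink = bytearray()
ep_write_all(b"hello raw gadget")
print(bytes(sink))                # -> b'hello raw gadget'
```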
I also created issue #163 about a bug in the Facedancer code: OUT control transfers call `ack()`, which calls `send_on_endpoint()`, which is meant for IN endpoints. Only the proxy calls `ack_status_stage()`, which Raw Gadget actually doesn't use. So for now I just ignore these acks when `last_control_direction` is OUT.
Actually, I understand it a bit better now. I implemented `send_on_control_endpoint`, which handles all control requests (not just IN), and I think `ack_status_stage` is just obsolete.
Could you also squash all changes into one commit and rebase onto the main branch? (But you can keep adding new changes as separate patches to make it easier to see the diff.) Right now rebasing is painful, as the first few patches are based on a pre-v3 revision.
I squashed the changes so it's easier to work off of. I don't plan to work on this further without feedback from the maintainers. I think it works well enough for a first pass so I would prefer to have it merged as is and have subsequent work done in follow ups.
If you want to keep working on it maybe it's best to make another PR.
Made a PR with fixes and improvements: #164.
@antoinevg please look at that one when you get a chance.
Closing in favor of #164.