benlink icon indicating copy to clipboard operation
benlink copied to clipboard

send_tnc_data always replies with `INCORRECT_STATE` before succeeding

Open khusmann opened this issue 11 months ago • 12 comments

If you try to send any data with benlink.client.RadioClient.send_tnc_data it will immediately reply with a INCORRECT_STATE error. If you immediately retry the command within two seconds, it will work. I have no idea why it does this. In all of my btsnoop logs, the command appears to work on the first try.

If anyone can figure out what's going on here, please let me know!

khusmann avatar Dec 25 '24 01:12 khusmann

Ah ha, it works the first try if you send the commands while simultaneously connected via rfcomm to the audio channel. Looks like you can probe for the channel layout via sdptool records <uuid>. With my UV-Pro I get this:

Service RecHandle: 0x10000
Service Class ID List:
  "PnP Information" (0x1200)

Service Name: BS AOC
Service RecHandle: 0x10001
Service Class ID List:
  UUID 128: 39144315-32fa-40db-85ed-fbfeba2d86e6
Protocol Descriptor List:
  "L2CAP" (0x0100)
  "RFCOMM" (0x0003)
    Channel: 2
Language Base Attr List:
  code_ISO639: 0x656e
  encoding:    0x6a
  base_offset: 0x100

Service Name: Voice Gateway
Service RecHandle: 0x10002
Service Class ID List:
  "Handsfree Audio Gateway" (0x111f)
  "Generic Audio" (0x1203)
Protocol Descriptor List:
  "L2CAP" (0x0100)
  "RFCOMM" (0x0003)
    Channel: 3
Profile Descriptor List:
  "Handsfree" (0x111e)
    Version: 0x0107

Service Name: SPP Dev
Service RecHandle: 0x10003
Service Class ID List:
  "Serial Port" (0x1101)
Protocol Descriptor List:
  "L2CAP" (0x0100)
  "RFCOMM" (0x0003)
    Channel: 1
Profile Descriptor List:
  "Serial Port" (0x1101)
    Version: 0x0102

Weirdly, even though the "voice gateway" is listed on channel 3, all of the audio packets show up on channel 2...

khusmann avatar Dec 26 '24 06:12 khusmann

@Ylianst in https://github.com/khusmann/benlink/issues/14 you mention:

I would get a notification that HtStatus.is_in_tx would change to false. I changed my code to use that as a way to know when to send the next packet and it seems to work.

I haven't tried waiting for HtStatus.is_in_tx -- do you find this to be a reliable approach? (Also does opening the audio rfcomm fix the problem like it did for me?)

khusmann avatar Feb 12 '25 21:02 khusmann

@khusmann I have not tried it a lot but so far, yes, it's working. Also, because I get the event change and suddenly send data starts working after that, it's a really good clue that it's on the right track. I have no idea for audio rfcomm, I have not tried that yet.

Ylianst avatar Feb 12 '25 21:02 Ylianst

Yeah, this might be something that's different on different firmware versions. (What firmware are you currently on? I think I was testing on 0.7.10 when I documented this)

For me, I would ALWAYS get INCORRECT_STATE when trying to send a message -- the only way I could send a message was to quickly follow it up with a retry (in under a second). With the audio channel open, it would always succeed first try. Is this different from what you're looking at?

I'd be really interested to hear what your device behavior is with the audio channel open.

khusmann avatar Feb 12 '25 21:02 khusmann

Just tried this with upgraded firmware 0.8.0 -- no issue with INCORRECT_STATE replies on first send, so that seems to definitely be a firmware bug. So I was incorrect assuming earlier this was an offshoot of the behavior @Ylianst was describing earlier, sorry about that!

That said, I think the proper solution on this front is to have a sending / retry algorithm that addresses both simultaneously. @Ylianst are you doing something like this?

async def send_tnc_data(self, message: bytes):
  fragments: List[MessageFragment] = split_into_fragments(message)

  for fragment in fragments:
    self.send_tnc_fragment_with_retries(fragment)

async def send_tnc_fragment_with_retries(self, fragment: MessageFragment, max_retries: int = 3, retry_delay: float = 0.2):
  curr_retry = 0

  while True:
    if self.status.is_in_tx:
      await self.wait_for_status(is_in_tx = False)
    try:
      await self.send_tnc_fragment(fragment)
    except RadioError as re:
      if curr_retry < max_retries:
        asyncio.sleep(retry_delay)
        curr_retry += 1
      else:
        raise

That should handle both the buggy INCORRECT_STATE issue on old firmwares, as well as wait for is_in_tx = False. Any other pieces I'm missing here?

khusmann avatar Feb 16 '25 20:02 khusmann

@khusmann One thing I did possibly notice is that if you get INCORRECT_STATE while sending a fragment that is not the first one, that is not recoverable and you need to skip all remaining fragments. However, I have not encountered this often and could be wrong, but in my case INCORRECT_STATE on a fragment id that is not zero will cause all me to skip remaining fragments.

Ylianst avatar Feb 16 '25 21:02 Ylianst

@khusmann One thing I did possibly notice is that if you get INCORRECT_STATE while sending a fragment that is not the first one, that is not recoverable and you need to skip all remaining fragments.

from my testing using https://github.com/Jakeler/ble-serial i got the idea that this "skipping" issue is also triggered by sending large fragments (around 500 bytes).

gretel avatar Feb 17 '25 17:02 gretel

However, I have not encountered this often and could be wrong, but in my case INCORRECT_STATE on a fragment id that is not zero will cause all me to skip remaining fragments.

@Ylianst That makes sense -- if someone hits the PTT or something in the middle of an xmit and interrupts the sequence, you don't want to continue... otherwise it'd look like two disjoint messages to someone on the receiving end.

i got the idea that this "skipping" issue is also triggered by sending large fragments (around 500 bytes).

interesting. In my exploring the decompiled HT app it looked like message fragments were limited to 53 bytes.

@gretel Just to confirm, when you were testing with ble-serial were you sending packets via KISS or were you putting together native HT_SEND_DATA commands?

khusmann avatar Feb 17 '25 18:02 khusmann

@gretel Just to confirm, when you were testing with ble-serial were you sending packets via KISS or were you putting together native HT_SEND_DATA commands?

via KISS.

gretel avatar Feb 18 '25 09:02 gretel

@khusmann I have noticed that I can send more than 53 bytes of data on the native API, I think I could do up to 250 or 300-ish. However, but have not explored the consequences of this and sticking to the smaller value right now. At some point, I will implement a file transfer system and then, I will certainly play with these values a lot more.

Ylianst avatar Feb 18 '25 17:02 Ylianst

@khusmann Do you happen to know if it's possible to send multiple data frames without the radio releasing the PTT? So, with minimal gap between the frames? My torrent system sends a lot of data and it would be great to merge the frames into a stream of frames. Would KISS have this behavior? - Thanks.

Ylianst avatar Apr 03 '25 06:04 Ylianst

@Ylianst nope, don't know how to do this other than sending via your own audio stream

khusmann avatar Apr 04 '25 14:04 khusmann