ttn-esp32

EU transmission appears to be waiting for 1% duty cycle before transmitting

Open DylanGWork opened this issue 8 months ago • 8 comments

Issue: Device hangs for an extended period during LoRa transmission

I'm investigating issues from our field deployments and came across unexpected behavior during testing.

Setup:
- Transmission interval: every 3 minutes
- Data rate: DR0
- Region: EU
- Payload size: 5 bytes (air time calculated using this tool)

Observed Behavior: The device consistently pauses (hangs) at line 397 (see screenshot below) for approximately 2 minutes before proceeding. After this delay, it quickly prints line 399 and returns to deep sleep within about one second.

Screenshots: code with my print statements (image); output (image)

This behaviour creates usability issues: users sometimes need to wait 3 minutes before they can set up a device, because provisioning is a two-stage process of pressing buttons and sending data.

Any idea why this happens at this point? It also keeps the device awake, instead of in deep sleep, for minutes longer than it should be. Is there a workaround?

DylanGWork, Apr 17 '25 04:04

Update: I have successfully caught a device "hanging" and never getting past the printout at line 397. It has been in this state for several hours now, which matches the issue I experienced in the field. The difference seems to be that it is in opmode=808:

Image

Any idea what could have happened here, and how to avoid it?

DylanGWork, Apr 17 '25 14:04

I have limited understanding of the underlying LMIC library and limited understanding of the opmode. I believe that both opmode 0x808 and 0x908 mean that data should be transmitted and the channel switched. I don't understand the difference between these two modes. And I didn't quite get which one you have observed in the problematic case.

The code line you are pointing at is just the intra-task communication between the LMIC background task and foreground code. It is to be expected that it is waiting there until data has been transmitted. It is also to be expected that the device does not enter deep sleep if there is data to be transmitted. So from this information it's difficult to judge what's going on.

From your output it looks like you have turned on the debugging output of the underlying LMIC library. That's a bad idea unless you have modified printf so that it does not block. Otherwise it messes up the timing of the LMIC background task, which is then likely to stop working correctly. Instead, use the output of the ttn-esp32 library by setting LMIC_ENABLE_event_logging=1 in your project. This output is less detailed but does not interfere with the LMIC timing.

The opmodes 0x808 and 0x908 are also used when the LMIC code is waiting until the device is allowed to transmit again. So airtime could be a problem. Unfortunately, I don't understand how the LMIC code calculates the air time. It might use a different formula than the air time calculator you have linked. DR0 is very slow and results in an air time of more than 1 second for each message. And TTN recommends not using more than 30 seconds of air time per day. So go figure...
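To put rough numbers on the airtime concern, here is a minimal sketch of the time-on-air formula from the Semtech SX127x datasheet, written with integer arithmetic. It is not LMIC's own calculation, which may differ as noted above. The function names are hypothetical, and the inputs are assumptions: DR0 in EU868 is taken as SF12 at 125 kHz, coding rate 4/5, explicit header, CRC on, 8 preamble symbols, and the 5-byte payload is assumed to carry 13 bytes of LoRaWAN overhead (no FOpts), giving an 18-byte PHY payload. The duty-cycle helper assumes a plain 1% rule, meaning a silent period of 99 times the airtime.

```c
#include <stdint.h>

/* Time on air in microseconds (Semtech SX127x datasheet formula).
 * cr: 1 = 4/5 ... 4 = 4/8. phy_len: full PHY payload in bytes. */
static uint32_t lora_airtime_us(unsigned sf, uint32_t bw_hz, unsigned phy_len,
                                unsigned cr, int crc_on, int implicit_hdr,
                                unsigned preamble_syms)
{
    /* Low data rate optimisation is used for SF11/SF12 at 125 kHz. */
    int de = (bw_hz == 125000 && sf >= 11) ? 1 : 0;
    /* Symbol duration: 2^SF / BW, in microseconds. */
    uint32_t tsym_us = (uint32_t)(((uint64_t)1000000 << sf) / bw_hz);

    int num = 8 * (int)phy_len - 4 * (int)sf + 28
              + (crc_on ? 16 : 0) - (implicit_hdr ? 20 : 0);
    int den = 4 * ((int)sf - 2 * de);
    unsigned payload_syms = 8;
    if (num > 0)
        payload_syms += (unsigned)((num + den - 1) / den) * (cr + 4); /* ceil */

    /* The preamble lasts (n + 4.25) symbols, so count in quarter symbols. */
    uint32_t quarter_syms = 4 * (preamble_syms + payload_syms) + 17;
    return (uint32_t)((uint64_t)quarter_syms * tsym_us / 4);
}

/* Silent time needed after one transmission under a simple duty-cycle
 * rule (assumption: duty_percent is in 1..100, e.g. 1 for 1%). */
static uint64_t duty_cycle_wait_us(uint32_t airtime_us, unsigned duty_percent)
{
    return (uint64_t)airtime_us * (100 - duty_percent) / duty_percent;
}

/* Number of uplinks that fit into a daily airtime budget. */
static uint32_t uplinks_per_day(uint32_t airtime_us, uint32_t budget_us)
{
    return budget_us / airtime_us;
}
```

Under these assumptions, `lora_airtime_us(12, 125000, 18, 1, 1, 0, 8)` gives 1318912 µs (about 1.3 s), `duty_cycle_wait_us` gives roughly 131 s for 1%, and `uplinks_per_day` with a 30 s budget gives 22. The 131 s figure is in the same range as the roughly two-minute pause reported above, and a 3-minute interval (480 uplinks per day) is far beyond 22 per day.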

manuelbl, Apr 17 '25 15:04

LMIC opmodes are defined here.

0x808 = TX user data (buffered in pendTxData) & find a new channel
0x908 = TX user data (buffered in pendTxData) & prevent TX lining up after a beacon

This means LMIC is delaying the queued TX user data while waiting for the duty cycle.

You ran out of airtime. Try to use a higher DR to reduce airtime.

cyberman54, Apr 18 '25 14:04

Cheers for the detailed replies. I've got 3 devices doing the same thing yet only one device decided to enter the 0x808 state and I've been unable to repeat it over 4 days of intense stress testing (in an isolated environment). I've been running the devices in SF7 and SF12 at 2 minute intervals for 1-2 days each test run, SF12 consistently pauses for a few minutes as described earlier during the 0x908/0x900 opmodes.

I'm still unsure how I managed to get one device, during one test, into a different state from the other devices, and why I can't repeat it. Nothing else special is happening: just a 2-minute interval with a 5-byte payload. ADR reactivity is set to 2, as I suspected there might be a relationship between ADR and the field issues I experience, although I'm yet to find any concrete evidence of this.

I found the LMIC opmode codes in the lmic.h file, but I don't understand how they combine to make 0x808 or 0x908. You're suggesting that 0x808 is derived from:

OP_TXDATA = 0x0008, // TX user data (buffered in pendTxData)
OP_NEXTCHNL = 0x0800, // find a new channel

I would have expected these to make 0x0808 or something along those lines, or does it drop the leading 0s? Then for 0x908 you've suggested that it's:

OP_TXDATA = 0x0008, // TX user data (buffered in pendTxData)
OP_RNDTX = 0x0100, // prevent TX lining up after a beacon

but that would be 0x108. If I add in "OP_NEXTCHNL = 0x0800, // find a new channel", does it then make 0x908?

I'll change to the logging suggestion too. Cheers for the help.

DylanGWork, Apr 22 '25 04:04

The LMIC opmode is a bitmask, so the individual values (bits) are bitwise OR'd together, which results in the value you see.
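As a small illustration of the OR'ing, the sketch below decodes an opmode value into flag names. It uses only the three flag values quoted in this thread (lmic.h defines more), and `describe_opmode` is a hypothetical helper, not part of LMIC. Note that 0x0808 and 0x808 are the same number; the leading zero is simply not printed.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Flag values as quoted in this thread; lmic.h defines further flags. */
enum {
    OP_TXDATA   = 0x0008, /* TX user data (buffered in pendTxData) */
    OP_RNDTX    = 0x0100, /* prevent TX lining up after a beacon */
    OP_NEXTCHNL = 0x0800  /* find a new channel */
};

/* Hypothetical helper: writes the names of the known flags set in
 * opmode into buf, separated by '|', and returns the bits that were
 * not decoded (unknown flags, or flags that did not fit into buf). */
static uint16_t describe_opmode(uint16_t opmode, char *buf, size_t len)
{
    static const struct { uint16_t bit; const char *name; } flags[] = {
        { OP_TXDATA,   "OP_TXDATA"   },
        { OP_RNDTX,    "OP_RNDTX"    },
        { OP_NEXTCHNL, "OP_NEXTCHNL" },
    };
    size_t used = 0;

    if (len > 0)
        buf[0] = '\0';
    for (size_t i = 0; i < sizeof flags / sizeof flags[0]; i++) {
        if ((opmode & flags[i].bit) == 0)
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
                         used ? "|" : "", flags[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break; /* output truncated: stop decoding */
        used += (size_t)n;
        opmode &= (uint16_t)~flags[i].bit; /* clear the decoded bit */
    }
    return opmode;
}
```

With a 64-byte buffer, 0x808 decodes to `OP_TXDATA|OP_NEXTCHNL` and 0x908 to `OP_TXDATA|OP_RNDTX|OP_NEXTCHNL`, which matches the guess above that 0x908 includes OP_NEXTCHNL as well.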

cyberman54, Apr 22 '25 06:04

Remember: with ADR on, the network, not your code, controls the SF and thus the airtime budget. This can vary between boards due to different RSSI values seen by the network server.

cyberman54, Apr 22 '25 12:04

I don't understand why, after sending the first transmission, the library sets the opmode to:

OP_TXDATA = 0x0008, // TX user data (buffered in pendTxData)
OP_NEXTCHNL = 0x0800, // find a new channel
+ OP_RNDTX = 0x0100,

OP_RNDTX doesn't make sense if you are using class A, because there are no beacons. With this in mind, I implemented a "solution". OP_RNDTX is set in LMIC's txDelay function; my modification clears the flag there instead. The following code shows the implementation:

    static void txDelay (ostime_t reftime, u1_t secSpan) {
        if (secSpan != 0)
            reftime += LMICcore_rndDelay(secSpan);
        if( LMIC.globalDutyRate == 0 || (reftime - LMIC.globalDutyAvail) > 0 ) {
            LMIC.globalDutyAvail = reftime;
            LMIC.opmode &= ~OP_RNDTX; //<-- cancel mode
        }
    }

Can someone explain to me why LMIC does this in class A? Thank you.

jordicr2004, Jun 10 '25 14:06

@jordicr2004 As ttn-esp32 is an ESP-IDF wrapper for the arduino-lmic library and your question concerns the LMIC library, you are probably better off raising it there.

manuelbl, Jun 10 '25 18:06