coreMQTT Problem of unnecessary waiting time for reception

I use the FreeRTOS-LTS 2022210.01. I realized that CoreMQTT had a idle period every data paket of OTA when I do OTA of FreeRTOS.

I doesn't happen when I use previsou version of LTS of FreeRTOS of CoreMQTT ver 1.

It affects the period of OTA because the wast idle time in here. Below figure is the waveform I got do the OTA using FreeRTOS v202210.01 with CoreMQTT v2. As you can see the idle period is here at the end of OTA packets.

I checked why this issue is happned, then the below code is requested the Receive API not to check the total packet size. Then, the Receive API is waiting until timeout, because the recevied packet size is smaller than the upper API requested.

https://github.com/FreeRTOS/coreMQTT/blob/03290fe0274e1c9cee5f3608e197daedbfdd1e1f/source/core_mqtt.c#L1685

CoreMQTT receive API says the a byte recevie doesn't block. So we have implemeted it. But V2 of CoreMQTT happened avobe issue.

https://freertos.github.io/coreMQTT/v2.1.1/group__mqtt__callback__types.html#ga227df31d6daf07e5d833537c12130167

Sep 14 '23 07:09 KeitaKashima

Offhand, this looks like expected behavior to me. If you are receiving more than one byte from the transport interface, then blocking may happen. In this case it looks like it does happen. Are you somehow requesting only a single byte when getting to the end of the OTA packets?

Phrased differently - Why is this idle time unexpected? Why shouldn't it appear here?

Sep 15 '23 16:09 kstribrnAmzn

Hi kstribrnAmzn, thank you for your reply.

My explain was not good, the single byte check was performed by CoreMQTT v1. I think it was the way to get the efficiency throughput of the receivce communication.

The MQTT Agent and CoreMQTT v1 did below process, then we could skip the waste wait for waiting the receive data untile timeout.

CoreMQTT checked 1byte packet
If the 1byte packet could be received, it receives remain data. If there is no 1byte packet data at 1), the receive process exits soon bacuase coreMQTT Receive API said below recommendation. And it didn't do the receive operation becasue the data is not coming that.

https://freertos.github.io/coreMQTT/v2.1.1/group__mqtt__callback__types.html#ga227df31d6daf07e5d833537c12130167

However, v2 of CoreMQTT requests 5KB data without checking of 1byte packet if the data is comming or not. So, every receive sequence has the waist time to wait the data even though the data is not coming.

Sep 19 '23 08:09 KeitaKashima

Hi @kstribrnAmzn

This behavious is related to OTA update. I faced the same issue. I wanted to speed up the OTA feature with core-mqtt v1 before. By some investigations, I realized that after device received a data block, it had a request to download next data block. The period from having a request to publish it is mqttexampleTRANSPORT_SEND_RECV_TIMEOUT_MS in mqtt_agent. That means we need to wait mqttexampleTRANSPORT_SEND_RECV_TIMEOUT_MS (blocking happens here).

With core-mqtt v1, when requesting to read a single byte, the receive implement may block a timeout priod(mqttexampleTRANSPORT_SEND_RECV_TIMEOUT_MS- I dont remember exactly the config). So, I implemented the receive function to support not block the request a single byte. If a single byte receives, change timeout in receive function to 1ms. The OTA update speed will be improved for sure without modifying mqttexampleTRANSPORT_SEND_RECV_TIMEOUT_MS.

And I applied it to core-mqtt v2.1.1, it did not work properly because core-mqtt v2 does not request a single byte anymore. As you mentioned above, it looks like it does happen because I am receiving more than one byte from the transport interface. How can I reduce the timeout block with core-mqtt v2.1.1 without modifying mqttexampleTRANSPORT_SEND_RECV_TIMEOUT_MS?

Next question : Is this material fixed? As my understanding, core-mqtt v2 does not request a single byte anymore. Why having a recommend that the transport receive implementation does NOT block when requested to read a single byte?

https://freertos.github.io/coreMQTT/v2.1.1/group__mqtt__callback__types.html#ga227df31d6daf07e5d833537c12130167

Sep 19 '23 09:09 VanNamDinh

Hi @kstribrnAmzn

Any update for this issue?

Feb 08 '24 02:02 VanNamDinh

@KeitaKashima, @VanNamDinh thank you for your patience. I didn't mean to keep you waiting so long. The notifications slipped by me. I'm going to tag my coworker, @AniruddhaKanhere, in this reply as he was one of the most significant contributors in CoreMQTT v2. I'm hoping he'll keep my answer honest.

Unfortunately our documentation is misguiding. "However, v2 of CoreMQTT requests 5KB data without checking of 1byte packet if the data is comming or not. So, every receive sequence has the waist time to wait the data even though the data is not coming." isn't quite correct. The TransportRecv_t should return as many bytes as the network interface has at the time of being called. CoreMQTT will then parse the received bytes into the corresponding MQTT packets. CoreMQTT will then return a NoDataAvailable or NeedMoreBytes error if the MQTT packet is incomplete. There should be no wasted time.

The single byte packet read of CoreMQTT v1 was removed as this encouraged packet fragmentation. With CoreMQTT v2.0.0+ the MQTT packet is returned only when the entire MQTT packet has been read from the transport interface.

Edit: "bytes" not "bites"

Feb 08 '24 22:02 kstribrnAmzn

I will create a PR updating the documentation here. The transport receive function should no longer block in any condition. This method should return whatever bytes it has (this could be anywhere from 0 to the requested amount). CoreMQTT will then assemble the MQTT packet out of the received bytes and return it when complete. Any bytes from the next MQTT packet will remain in the CoreMQTT buffer until all bytes needed for that packet have been received.

Feb 08 '24 22:02 kstribrnAmzn

@kstribrnAmzn said:

The single byte packet read of CoreMQTT v1 was removed as this encouraged packet fragmentation. With CoreMQTT v2.0.0+ the MQTT packet is returned only when the entire MQTT packet has been read from the transport interface.

I would like to make a small correction. coreMQTT v1 read one byte in the first go to determine the packet type, then it would read one byte at a time to get the packet length, and in the end it would read the whole packet. This lead to 6 calls per packet at the most and 3 calls per packet at the least to the network interface receive function making it inefficient. Additionally, when the library tried to read the whole packet (lets say 100 bytes), but the network interface had fragmented the packet and returned only 50 bytes, the library would return an error which would lead to killing of the MQTT connection. Which is not correct.

coreMQTT v2 tries to read as many bytes as possible and then parses all the data that it received from the network interface. In case the library receives an incomplete packet (same as above - 50 bytes out of 100), the error code MQTTNeedMoreBytes is returned to let the caller know that MQTTProcessLoop should be called again so that the remaining packet can be retrieved from the transport interface.

Feb 08 '24 22:02 AniruddhaKanhere

@kstribrnAmzn

Thank you for your reply.

The TransportRecv_t should return as many bytes as the network interface has at the time of being called. CoreMQTT will then parse the received bytes into the corresponding MQTT packets.

I got the right policy not to wait for the received data in the TransportRecv_t.

However, the TLS_FreeRTOS_Connect is needed to set the wait values for send and receive, as shown below.

I thought this wait time was for waiting comming the data when we called the TransportRecv_t.

Where should the wait time of TLS_FreeRTOS_Connect be affected?

https://www.freertos.org/mqtt/server-authentication-mqtt-example.html

Feb 09 '24 01:02 KeitaKashima

Hello @KeitaKashima,

I recommend separating both the timeout values in the call to TLS_FreeRTOS_Connect function. The send timeout should be used when sending data using transport send functions - it can be non-zero. However, the receive timeout must be 0 as I mentioned previously.

Feb 11 '24 06:02 AniruddhaKanhere

Hello @KeitaKashima, I take it from your reaction (👍) to my post that the above issue is resolved. I shall be closing this thread.

If you'd feel that the issue persists or the solution was not satisfactory, please feel free to reopen this thread or open a new one.

May 10 '24 16:05 AniruddhaKanhere

Hi @AniruddhaKanhere ,

Yes, I could solve the issue by following your idea. So I think it is okay to close this ticket. I would like you to update the demo code to set separating timeout of this in the future demo.

May 13 '24 01:05 KeitaKashima