Wifi performance of ESP32-S3 really bad, lot worse compared to ESP8266 (IDFGH-13236)
Answers checklist.
- [x] I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
- [X] I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
- [X] I have searched the issue tracker for a similar issue and not found a similar issue.
IDF version.
v5.2.2
Espressif SoC revision.
ESP32-S3 0.1
Operating System used.
Linux
How did you build your project?
Command line with idf.py
If you are using Windows, please specify command line type.
None
Development Kit.
Wemos/Lolin S3 mini
Power Supply used.
USB
What is the expected behavior?
I expect a payload transfer speed of at least one tenth of the connection speed, so in the range of 5400 kbit/s (675 kbyte/s). On the ESP8266, with a similar setup (sending or receiving 4k blocks over either TCP or UDP, on the same wireless network), I can obtain around 800 kbyte/s (depending on range and interference, but there is more handling involved there). On the ESP32 I obtain little more than 200 kbyte/s to the ESP32 and 600 kbyte/s from the ESP32.
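The arithmetic behind these figures can be written out as a minimal sketch. The 54 Mbit/s link rate is an assumption inferred from the 5400 kbit/s figure (a 65 Mbit/s association is mentioned further down the thread), and the function names are illustrative only.

```c
#include <assert.h>

/* Payload rate expected when only 1/divisor of the link rate is usable. */
static unsigned int expected_payload_kbit(unsigned int link_kbit_per_s, unsigned int divisor)
{
    assert(divisor != 0);
    return link_kbit_per_s / divisor;
}

/* kbit/s to kbyte/s: 8 bits per byte, decimal prefixes on both sides. */
static double kbit_to_kbyte(unsigned int kbit_per_s)
{
    return kbit_per_s / 8.0;
}
```

With a 54000 kbit/s link and a divisor of 10 this gives 5400 kbit/s, or 675 kbyte/s. For a 65000 kbit/s link it gives 6500 kbit/s, or 812.5 kbyte/s.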
What is the actual behavior?
Very low network performance, much lower than ESP8266 in the same setup. Connection to access point is exactly the same, as is the rest of the infrastructure. The only real difference is that ESP8266 uses native LWIP callback API while the ESP32 image uses the LWIP POSIX API.
Note: you cannot leave out the "ACK" stuff, otherwise the non-ESP32 side will just queue up everything in memory and then report the test as ready, without having sent a single byte yet. The "ACK"-ing introduces a bit of lag, I am aware of that.
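The lag can be bounded with a small model, sketched here under stated assumptions: with one 4k block in flight per "ACK", each block costs its own transfer time plus one round trip. The round-trip time is a hypothetical input, not a figure measured in this report, and the function name is illustrative.

```c
#include <assert.h>

/*
 * Upper bound on throughput (bytes per second) when every block must be
 * acknowledged before the next one is sent:
 *   rate = block / (block / link_rate + round_trip_time)
 */
static double stop_and_wait_rate(double block_bytes, double link_bytes_per_s, double rtt_s)
{
    assert(block_bytes > 0.0 && link_bytes_per_s > 0.0 && rtt_s >= 0.0);
    return block_bytes / (block_bytes / link_bytes_per_s + rtt_s);
}
```

For example, at 675000 byte/s with a hypothetical 10 ms round trip, a 4096-byte block tops out near 255 kbyte/s. With a zero round trip the bound equals the link rate.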
I do not see any errors on the wireless controller for this association. I don't think it's an RF issue. Looks more like an issue within LWIP or the IDF.
Steps to reproduce.
Use a very simple program like the one below, with plain POSIX socket calls: socket/bind/listen/accept/send/recv/close. Use a client that sends 4k blocks upon reception of the word "ACK", or a client that receives 4k blocks whenever it sends "ACK". Use default values for the IDF configuration; I've tried many settings and it doesn't matter much. Use of SPIRAM doesn't matter much either.
For the full source code of the ESP32-S3 image, see here: https://github.com/eriksl/esp32. The performance testing code is currently disabled, adjust init.c to enable it. For the client I used, see here: https://github.com/eriksl/e32if
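#include <stdint.h>
#include <stdbool.h>
#include <assert.h>
#include <errno.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "esp_heap_caps.h"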
#include "perftest.h"
#include "string.h"
#include "cli-command.h"
#include "log.h"
#include "util.h"
static bool inited = false;
enum
{
//malloc_type = MALLOC_CAP_INTERNAL
malloc_type = MALLOC_CAP_SPIRAM
};
static void run_tcp_receive(void *)
{
enum { size = 4096 };
char *receive_buffer;
int accept_fd;
struct sockaddr_in6 si6_addr;
socklen_t si6_addr_length;
int length;
int tcp_socket_fd;
static const char ack[] = "ACK"; // array, so sizeof(ack) is the 4 bytes of "ACK" plus NUL, not the size of a pointer
enum { attempts = 8 };
unsigned int attempt;
assert(inited);
receive_buffer = heap_caps_malloc(size, malloc_type);
memset(&si6_addr, 0, sizeof(si6_addr));
si6_addr.sin6_family = AF_INET6;
si6_addr.sin6_port = htons(9); // discard
assert((accept_fd = socket(AF_INET6, SOCK_STREAM, 0)) >= 0);
assert(bind(accept_fd, (const struct sockaddr *)&si6_addr, sizeof(si6_addr)) == 0);
assert(listen(accept_fd, 0) == 0);
for(;;)
{
si6_addr_length = sizeof(si6_addr);
if((tcp_socket_fd = accept(accept_fd, (struct sockaddr *)&si6_addr, &si6_addr_length)) < 0)
{
log_format_errno("perftest: accept fails: %d", tcp_socket_fd);
continue;
}
assert(sizeof(si6_addr) >= si6_addr_length);
for(;;)
{
length = recv(tcp_socket_fd, receive_buffer, size, 0);
if(length <= 0)
{
log_format("perftest tcp recv: %d", length);
break;
}
for(attempt = attempts; attempt > 0; attempt--)
{
length = send(tcp_socket_fd, ack, sizeof(ack), 0);
if(length == sizeof(ack))
break;
log_format("perftest tcp send ack: %d, try %d", length, attempt);
vTaskDelay(100 / portTICK_PERIOD_MS);
}
if(attempt == 0)
log("perftest tcp send ack: no more tries");
}
close(tcp_socket_fd);
}
}
static void run_tcp_send(void *)
{
enum { size = 4096 };
char *send_buffer;
int accept_fd;
struct sockaddr_in6 si6_addr;
socklen_t si6_addr_length;
int length;
int tcp_socket_fd;
static const char ack[] = "ACK"; // array, so sizeof(ack) is the 4 bytes of "ACK" plus NUL, not the size of a pointer
enum { attempts = 8 };
unsigned int attempt;
assert(inited);
send_buffer = heap_caps_malloc(size, malloc_type);
memset(&si6_addr, 0, sizeof(si6_addr));
si6_addr.sin6_family = AF_INET6;
si6_addr.sin6_port = htons(19); // chargen
assert((accept_fd = socket(AF_INET6, SOCK_STREAM, 0)) >= 0);
assert(bind(accept_fd, (const struct sockaddr *)&si6_addr, sizeof(si6_addr)) == 0);
assert(listen(accept_fd, 0) == 0);
for(;;)
{
si6_addr_length = sizeof(si6_addr);
if((tcp_socket_fd = accept(accept_fd, (struct sockaddr *)&si6_addr, &si6_addr_length)) < 0)
{
log_format_errno("perftest: accept fails: %d", tcp_socket_fd);
continue;
}
assert(sizeof(si6_addr) >= si6_addr_length);
for(;;)
{
length = recv(tcp_socket_fd, send_buffer, sizeof(ack), 0);
if(length <= 0)
{
log_format("perftest tcp recv 2: %d", length);
break;
}
for(attempt = attempts; attempt > 0; attempt--)
{
length = send(tcp_socket_fd, send_buffer, size, 0);
if(length == size)
break;
if((length < 0) && ((errno == ENOTCONN) || (errno == ECONNRESET)))
goto abort;
log_format_errno("perftest tcp send 2: %d, try %d", length, attempt);
vTaskDelay(100 / portTICK_PERIOD_MS);
}
if(attempt == 0)
log("perftest tcp send 2: no more tries");
}
abort:
close(tcp_socket_fd);
}
}
static void run_udp_receive(void *)
{
enum { size = 4096 };
char *receive_buffer;
struct sockaddr_in6 si6_addr;
socklen_t si6_addr_length;
int length;
int udp_socket_fd;
static const char ack[] = "ACK"; // array, so sizeof(ack) is the 4 bytes of "ACK" plus NUL, not the size of a pointer
enum { attempts = 8 };
unsigned int attempt;
assert(inited);
receive_buffer = heap_caps_malloc(size, malloc_type);
memset(&si6_addr, 0, sizeof(si6_addr));
si6_addr.sin6_family = AF_INET6;
si6_addr.sin6_port = htons(9); // discard
assert((udp_socket_fd = socket(AF_INET6, SOCK_DGRAM, 0)) >= 0);
assert(bind(udp_socket_fd, (const struct sockaddr *)&si6_addr, sizeof(si6_addr)) == 0);
for(;;)
{
si6_addr_length = sizeof(si6_addr);
length = recvfrom(udp_socket_fd, receive_buffer, size, 0, (struct sockaddr *)&si6_addr, &si6_addr_length);
assert(sizeof(si6_addr) >= si6_addr_length);
if(length <= 0)
{
log_format("perftest udp recv: %d", length);
continue;
}
for(attempt = attempts; attempt > 0; attempt--)
{
length = sendto(udp_socket_fd, ack, sizeof(ack), 0, (const struct sockaddr *)&si6_addr, si6_addr_length);
if(length == sizeof(ack))
break;
log_format("perftest udp send ack: %d, try %d", length, attempt);
vTaskDelay(100 / portTICK_PERIOD_MS);
}
if(attempt == 0)
log("perftest udp send ack: no more tries");
}
close(udp_socket_fd);
}
static void run_udp_send(void *)
{
enum { size = 4096 };
char *send_buffer;
struct sockaddr_in6 si6_addr;
socklen_t si6_addr_length;
int length;
int udp_socket_fd;
static const char ack[] = "ACK"; // array, so sizeof(ack) is the 4 bytes of "ACK" plus NUL, not the size of a pointer
enum { attempts = 8 };
unsigned int attempt;
assert(inited);
send_buffer = heap_caps_malloc(size, malloc_type);
memset(&si6_addr, 0, sizeof(si6_addr));
si6_addr.sin6_family = AF_INET6;
si6_addr.sin6_port = htons(19); // chargen
assert((udp_socket_fd = socket(AF_INET6, SOCK_DGRAM, 0)) >= 0);
assert(bind(udp_socket_fd, (const struct sockaddr *)&si6_addr, sizeof(si6_addr)) == 0);
for(;;)
{
si6_addr_length = sizeof(si6_addr);
length = recvfrom(udp_socket_fd, send_buffer, sizeof(ack), 0, (struct sockaddr *)&si6_addr, &si6_addr_length);
assert(sizeof(si6_addr) >= si6_addr_length);
if(length <= 0)
{
log_format("perftest udp recv 2: %d", length);
continue;
}
for(attempt = attempts; attempt > 0; attempt--)
{
length = sendto(udp_socket_fd, send_buffer, size, 0, (const struct sockaddr *)&si6_addr, si6_addr_length);
if(length == size)
break;
log_format("perftest udp send 2: %d, try %d", length, attempt);
vTaskDelay(100 / portTICK_PERIOD_MS);
}
if(attempt == 0)
log("perftest udp send 2: no more tries");
}
close(udp_socket_fd);
}
void perftest_init(void)
{
assert(!inited);
inited = true;
if(xTaskCreatePinnedToCore(run_tcp_receive, "perf-tcp-recv", 2 * 1024, (void *)0, 1, (TaskHandle_t *)0, 1) != pdPASS)
util_abort("perftest: xTaskCreatePinnedToCore tcp receive");
if(xTaskCreatePinnedToCore(run_tcp_send, "perf-tcp-send", 2 * 1024, (void *)0, 1, (TaskHandle_t *)0, 1) != pdPASS)
util_abort("perftest: xTaskCreatePinnedToCore tcp send");
if(xTaskCreatePinnedToCore(run_udp_receive, "perf-udp-recv", 2 * 1024, (void *)0, 1, (TaskHandle_t *)0, 1) != pdPASS)
util_abort("perftest: xTaskCreatePinnedToCore udp receive");
if(xTaskCreatePinnedToCore(run_udp_send, "perf-udp-send", 2 * 1024, (void *)0, 1, (TaskHandle_t *)0, 1) != pdPASS)
util_abort("perftest: xTaskCreatePinnedToCore udp send");
}
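For reference, the host side of the receive test described in the steps above could look roughly like the following. This is a hypothetical sketch, not the code from the e32if client: the helper names are made up, and it assumes exactly one 4-byte "ACK" per 4k block. That holds for the UDP test; over TCP the firmware acknowledges every recv() return, which may cover only part of a block.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* 4k payload blocks; the firmware answers with "ACK" plus NUL (4 bytes). */
enum { block_size = 4096, ack_size = 4 };

/* Send exactly len bytes; returns 0 on success, -1 on error. */
static int send_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = send(fd, buf, len, 0);
        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Receive exactly len bytes; returns 0 on success, -1 on error or EOF. */
static int recv_all(int fd, char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = recv(fd, buf, len, 0);
        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send `blocks` blocks, waiting for "ACK" after each one.
 * Returns the number of payload bytes acknowledged, or -1 on failure. */
static long send_blocks_with_ack(int fd, unsigned int blocks)
{
    static char block[block_size];
    char ack[ack_size];
    long total = 0;

    for (unsigned int i = 0; i < blocks; i++) {
        if (send_all(fd, block, sizeof(block)) != 0)
            return -1;
        if (recv_all(fd, ack, sizeof(ack)) != 0)
            return -1;
        if (memcmp(ack, "ACK", ack_size) != 0)
            return -1;
        total += block_size;
    }
    return total;
}
```

Dividing the returned byte count by the elapsed wall-clock time gives the throughput figure. The fd would be a socket connected to port 9 (discard) on the ESP32.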
Debug Logs.
No response
More Information.
No response
Hi @eriksl, we have an iperf example in IDF, could you please try that and share throughput numbers?
Also please share sniffer capture of that instance if possible.
Hi @eriksl
You can also refer to IDF docs https://docs.espressif.com/projects/esp-idf/en/v5.2.2/esp32s3/api-guides/wifi.html#how-to-improve-wi-fi-performance to improve Wi-Fi throughput.
I am aware of this document. I tried all of these, with very little change in performance.
The key point remains:
I am using a dead-simple loop of accept()->read()->repeat or accept()->write()->repeat, so I really think I should get much better performance, on both TCP and UDP. TCP-specific options therefore aren't really significant here. Otherwise I'd like a statement that good performance is not possible using the LWIP POSIX interface, which would explain why it's so much faster on the ESP8266, where I am using the LWIP native interface.
> Hi @eriksl, we have an iperf example in IDF, could you please try that and share throughput numbers?
> Also please share sniffer capture of that instance if possible.
Can't test. When running I get "Writing to serial is timing out. Please make sure that your application supports an interactive console and that you have picked the correct console for serial communication." every time. I can't type anything.
Probably the same issue with the USB console on the ESP32-S3 that I reported earlier.
FWIW the iperf example uses the same API (POSIX) as I do (connect, sendto, etc.)
Hi @eriksl
If I understand correctly, the iperf application uses UART as the default input and output for the console. If you want to use USB, you need to enable the USB Serial options (https://github.com/espressif/esp-idf/blob/af25eb447e3330c21e3b38e91db16332056882b2/components/esp_system/Kconfig#L237) in menuconfig.
I copied the config file I am using for my own firmware (which sees the low performance). This configuration has, of course, already console on USB-JTAG enabled. After copying the config, I ran idf.py menuconfig so a composite config could be generated.
@eriksl Could you upload the full logs of the throughput test?
Hi @eriksl
I made a preliminary comparison against your sdkconfig (https://github.com/eriksl/esp32/blob/master/s3/develop/sdkconfig). Compared to the default configuration of iperf, there seem to be many differences, such as CPU frequency, Wi-Fi configuration, LWIP configuration, and other items that have a significant impact on performance. For now, we hope that you can try to run our IDF iperf example and check whether there are any obvious hardware abnormalities.
ok will do that later. Currently not at home...
@MaxwellAlan I have tested with an extensive set of combinations of sdkconfig items, including CPU speed, cache sizes, LWIP options, SPIRAM options and WLAN options. They make a bit of difference, enough to confirm that they changed, but the throughput remains very bad nonetheless. Traffic originating from the ESP32 isn't that bad actually, it's almost as fast as the ESP8266. But traffic to be received by the ESP32 is really bad. Changing the IDF options results in minimal performance change, ranging from 150 kbyte/s to 250 kbyte/s (1.2 Mbps - 2.0 Mbps), while I can reach 500 kbyte/s on my ESP8266 (4.0 Mbps).
@MaxwellAlan any comments?
If anyone could get the iperf image working on the ESP32-S3 using the USB JTAG/serial console, I'd be grateful too...
@eriksl
Sorry for the late reply. I tested the ESP32-S3 over USB JTAG with the IDF v5.2.2 iperf example in a shield box.
TCP throughput is OK.
@eriksl You can also use the iperf example for testing; just enable ESP_CONSOLE_USB_SERIAL_JTAG in menuconfig and it should work.
Of course I did. And it doesn't work.
@eriksl Can you provide the error log from when you enable ESP_CONSOLE_USB_SERIAL_JTAG and the iperf example still doesn't work?
See here https://github.com/espressif/esp-idf/issues/14171#issuecomment-2228352494. There is no build error, it just doesn't work. It looks like the iperf image fetches its input directly from one of the UARTs and doesn't recognise/use the USB JTAG UART.
> See here #14171 (comment). There is no build error, it just doesn't work. It looks like the iperf image fetches its input directly from one of the UARTs and doesn't recognise/use the USB JTAG UART.
This occurs when your device doesn't print anything, and is usually seen with programs that don't have a follow-up action. In the iperf example, this issue normally does not happen.
I might try again with the newest stable IDF version. I know there have been a few fixes in this area. But without having had a look at the code, I really suspect the iperf code assumes having a real UART connected and not UART emulation over USB-JTAG.
@eriksl I noticed that you are not using our official development board. Because there is something special about the USB and the chip, I'm not sure the unofficial development board handles it well. Maybe you can experiment with the official S3 development board, or you can contact Wemos.
There is nothing wrong with the Wemos. I am using it all of the time with my own code, and the USB UART works like a charm. But apparently the iperf code doesn't handle it, or doesn't handle it well.
> It looks like the iperf image fetches its input directly from one of the UARTs and doesn't recognise/use the USB JTAG UART.
Shouldn't be the case, as long as you enable CONFIG_ESP_CONSOLE_USB_SERIAL_JTAG option in menuconfig:
https://github.com/espressif/esp-idf/blob/3b8741b172dc951e18509698dee938304bcf1523/examples/wifi/iperf/main/iperf_example_main.c#L41-L44
> There is nothing wrong with the Wemos. I am using it all of the time with my own code, USB UART works like a charm.
The point is that Wi-Fi performance is heavily related to factors such as PCB design, power supply quality, interference between the RF path and other high-speed signals, and so on. Purely digital functions such as USB 1.1 interface and the CPU / peripherals operation are influenced by these factors to a much smaller degree. So it is not unreasonable to try to use a different devboard to rule out such PCB-related issues.
Besides, it might be worth comparing the results with iperf when UART is used for console with the results when USB is used for console. The description of CONFIG_ESP_PHY_ENABLE_USB option says:
> On some ESP targets, the USB PHY can interfere with WiFi thus lowering WiFi performance. As a result, on those affected ESP targets, the ESP PHY library's initialization will automatically disable the USB PHY to get best WiFi performance.
Since USB_SERIAL_JTAG requires USB PHY to be enabled, it sounds like this might lower WiFi performance. I am not sure if the iperf log posted by @hansw123 in https://github.com/espressif/esp-idf/issues/14171#issuecomment-2330957321 is already with USB_SERIAL_JTAG console enabled, or with console over UART.
If there were issues with the analogue path, I would find evidence in the statistics from my (enterprise/managed) access points, and I can't find any. There is a connection at the highest speed 802.11n can achieve (65 Mbps) and it remains that way. It looks to me like we can rule out any signal issue.
In the meantime I do have another board (LilyGO T7 S3) and I will try it there. The comparison isn't completely fair though, as this one has the SPIRAM connected by 8-wire SPI vs. 4-wire. But I already discovered that SPIRAM speed doesn't really matter that much for this issue.
One of your fellow developers disclosed recently that the impact of using the USB PHY on the Wifi performance is really small, something along the lines of 1%. Besides that, with this kind of interference I'd expect, again, evidence from the access points.
It really looks as if something inside the Wifi handling, in the digital/software domain is handling something very slowly, for some reason, before the frames are handed to LWIP, so not something I can have any influence on.
@eriksl The iperf rate results I posted above were tested with USB enabled, so this proves that there is nothing wrong with our official development board and code, and also that USB affects wifi rates at around 1% based on our official development board. Unofficial development boards can not directly apply our test results.
Ok you have a point there.
Still I am very curious at what point the delay is appearing. I don't think ESP32 keeps much statistics there.
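One way to narrow this down from the application side is to time the gaps between successive recv() returns. The following is a minimal sketch with hypothetical names; on the ESP32 the timestamps could come from esp_timer_get_time(), which returns microseconds as an int64_t, but the accumulator itself takes the timestamp as a parameter and does not depend on any clock.

```c
#include <assert.h>
#include <stdint.h>

/* Running statistics over the time between consecutive samples. */
struct gap_stats {
    int64_t last_us;  /* timestamp of the previous sample, -1 before the first */
    int64_t min_us;
    int64_t max_us;
    int64_t sum_us;
    uint32_t count;   /* number of gaps recorded (samples minus one) */
};

static void gap_stats_init(struct gap_stats *s)
{
    s->last_us = -1;
    s->min_us = INT64_MAX;
    s->max_us = 0;
    s->sum_us = 0;
    s->count = 0;
}

/* Call once per recv() return with the current time in microseconds. */
static void gap_stats_sample(struct gap_stats *s, int64_t now_us)
{
    assert(now_us >= 0);
    if (s->last_us >= 0) {
        int64_t gap = now_us - s->last_us;
        if (gap < s->min_us)
            s->min_us = gap;
        if (gap > s->max_us)
            s->max_us = gap;
        s->sum_us += gap;
        s->count++;
    }
    s->last_us = now_us;
}

/* Mean gap in microseconds, or 0 when fewer than two samples were taken. */
static int64_t gap_stats_mean(const struct gap_stats *s)
{
    return s->count ? s->sum_us / (int64_t)s->count : 0;
}
```

A large maximum next to a small minimum would point at occasional stalls rather than a uniformly slow path. This only covers what the application can see, not the Wi-Fi driver internals.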
This issue has been closed, but I can't spot the solution or the reasoning for closing it.
Can somebody explain the solution or the fix for the reported issue?