esp-idf
DNS client fails with two active netifs? (IDFGH-12641)
Answers checklist.
- [X] I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
- [X] I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
- [X] I have searched the issue tracker for a similar issue and not found a similar issue.
IDF version.
5.2.1
Espressif SoC revision.
esp32s3 (lilygo t-sim7080g s3 board)
Operating System used.
Linux
How did you build your project?
Command line with idf.py
If you are using Windows, please specify command line type.
None
Development Kit.
lilygo t-sim7080g s3 board
Power Supply used.
USB
What is the expected behavior?
With two netifs (WiFi and cell) active, the DNS client appears to fail. It works well when the cell netif is not activated. This log, from a working run, shows three dns_recv events, as it should.
victus: {412} fgrep -a dns_ typescript.04
D (00:00:26.241) lwip: dns_init: initializing
0x4204a83c: esp_netif_get_dns_info_api at /home/danny/src/github/esp32/esp-idf-v5.2.1/components/esp_netif/lwip/esp_netif_lwip.c:1948
D (01:00:31.882) esp_netif_lwip: esp_netif_get_dns_info: esp_netif=0x3fcb9854 type=0
0x4204a83c: esp_netif_get_dns_info_api at /home/danny/src/github/esp32/esp-idf-v5.2.1/components/esp_netif/lwip/esp_netif_lwip.c:1948
D (01:00:31.909) esp_netif_lwip: esp_netif_get_dns_info: esp_netif=0x3fcb9854 type=1
0x4204a83c: esp_netif_get_dns_info_api at /home/danny/src/github/esp32/esp-idf-v5.2.1/components/esp_netif/lwip/esp_netif_lwip.c:1948
D (01:00:31.936) esp_netif_lwip: esp_netif_get_dns_info: esp_netif=0x3fcb9854 type=2
D (01:00:32.556) lwip: dns_enqueue: "ipv4.cloudns.net": use DNS entry 0
D (01:00:32.585) lwip: dns_enqueue: "ipv4.cloudns.net": use DNS pcb 0
D (01:00:32.592) lwip: dns_send: dns_servers[0] "ipv4.cloudns.net": request
D (01:00:33.049) lwip: dns_recv: "ipv4.cloudns.net": response =
D (01:00:33.713) lwip: dns_tmr: dns_check_entries
D (01:00:34.760) lwip: dns_tmr: dns_check_entries
D (01:00:35.815) lwip: dns_tmr: dns_check_entries
D (01:00:36.234) lwip: dns_enqueue: "ntp.jimmobile.be": use DNS entry 1
D (01:00:36.257) lwip: dns_enqueue: "ntp.jimmobile.be": use DNS pcb 0
D (01:00:36.263) lwip: dns_send: dns_servers[0] "ntp.jimmobile.be": request
D (01:00:36.632) lwip: dns_recv: "ntp.jimmobile.be": response =
D (01:00:36.989) lwip: dns_tmr: dns_check_entries
D (01:00:37.993) lwip: dns_tmr: dns_check_entries
[...]
D (01:00:51.263) lwip: dns_tmr: dns_check_entries
D (01:00:51.795) lwip: dns_enqueue: "pool.ntp.org": use DNS entry 2
D (01:00:51.823) lwip: dns_enqueue: "pool.ntp.org": use DNS pcb 0
D (01:00:51.830) lwip: dns_send: dns_servers[0] "pool.ntp.org": request
D (01:00:52.401) lwip: dns_tmr: dns_check_entries
D (01:00:52.406) lwip: dns_send: dns_servers[0] "pool.ntp.org": request
D (01:00:52.758) lwip: dns_recv: "pool.ntp.org": response =
D (20:29:12.293) lwip: dns_tmr: dns_check_entries
D (20:29:13.288) lwip: dns_tmr: dns_check_entries
What is the actual behavior?
The bad run (typescript.03) and the good run (typescript.04) show different patterns:
victus: {417} fgrep -a -e dns_recv -e "lwip: | 53 " typescript.03
D (19:53:51.659) lwip: | 53 | 60982 | (src port, dest port)
D (19:53:51.739) lwip: dns_recv: "ntp.jimmobile.be": response =
D (19:54:05.534) lwip: | 53 | 55132 | (src port, dest port)
D (19:54:05.887) lwip: | 53 | 55132 | (src port, dest port)
D (19:54:07.169) lwip: | 53 | 55132 | (src port, dest port)
D (19:54:07.405) lwip: | 53 | 47547 | (src port, dest port)
D (19:54:08.119) lwip: | 53 | 47547 | (src port, dest port)
D (19:54:09.362) lwip: | 53 | 55132 | (src port, dest port)
D (19:54:09.586) lwip: | 53 | 47547 | (src port, dest port)
D (19:54:11.498) lwip: | 53 | 47547 | (src port, dest port)
D (19:54:12.802) lwip: | 53 | 55132 | (src port, dest port)
D (19:54:13.744) lwip: | 53 | 55132 | (src port, dest port)
D (19:54:14.994) lwip: | 53 | 55132 | (src port, dest port)
D (19:54:15.218) lwip: | 53 | 47547 | (src port, dest port)
D (19:54:16.118) lwip: | 53 | 47547 | (src port, dest port)
D (19:54:17.352) lwip: | 53 | 55132 | (src port, dest port)
D (19:54:17.576) lwip: | 53 | 47547 | (src port, dest port)
D (19:54:19.816) lwip: | 53 | 47547 | (src port, dest port)
victus: {418} fgrep -a -e dns_recv -e "lwip: | 53 " typescript.04
D (01:00:32.969) lwip: | 53 | 57395 | (src port, dest port)
D (01:00:33.049) lwip: dns_recv: "ipv4.cloudns.net": response =
D (01:00:36.536) lwip: | 53 | 36505 | (src port, dest port)
D (01:00:36.632) lwip: dns_recv: "ntp.jimmobile.be": response =
D (01:00:52.646) lwip: | 53 | 19793 | (src port, dest port)
D (01:00:52.758) lwip: dns_recv: "pool.ntp.org": response =
D (01:00:53.215) lwip: | 53 | 19793 | (src port, dest port)
More detail from the failed run, showing one call that succeeded and one that failed:
D (19:53:51.638) lwip: ip4_input: p->len 126 p->tot_len 126
D (19:53:51.644) lwip: udp_input: received datagram of length 106
D (19:53:51.650) lwip: UDP header:
D (19:53:51.654) lwip: +-------------------------------+
D (19:53:51.659) lwip: | 53 | 60982 | (src port, dest port)
D (19:53:51.667) lwip: +-------------------------------+
D (19:53:51.673) lwip: | 106 | 0x9bb4 | (len, chksum)
D (19:53:51.679) lwip: +-------------------------------+
D (19:53:51.685) lwip: udp (
D (19:53:51.688) lwip: 192.168.0.203
D (19:53:51.692) lwip: , 60982) <-- (
D (19:53:51.695) lwip: 195.130.130.1
D (19:53:51.699) lwip: , 53)
D (19:53:51.702) lwip: pcb (
D (19:53:51.705) lwip: 0.0.0.0
D (19:53:51.708) lwip: , 60982) <-- (
D (19:53:51.712) lwip: 0.0.0.0
D (19:53:51.715) lwip: , 0)
D (19:53:51.718) lwip: pcb (
D (19:53:51.721) lwip: 0.0.0.0
D (19:53:51.724) lwip: , 68) <-- (
D (19:53:51.728) lwip: 0.0.0.0
D (19:53:51.731) lwip: , 67)
D (19:53:51.734) lwip: udp_input: calculating checksum
D (19:53:51.739) lwip: dns_recv: "ntp.jimmobile.be": response =
D (19:53:51.745) lwip: 18.239.208.57
D (19:53:51.749) lwip:
[...]
D (19:54:05.513) lwip: ip4_input: p->len 94 p->tot_len 94
D (19:54:05.519) lwip: udp_input: received datagram of length 74
D (19:54:05.525) lwip: UDP header:
D (19:54:05.529) lwip: +-------------------------------+
D (19:54:05.534) lwip: | 53 | 55132 | (src port, dest port)
D (19:54:05.542) lwip: +-------------------------------+
D (19:54:05.547) lwip: | 74 | 0x18d7 | (len, chksum)
D (19:54:05.554) lwip: +-------------------------------+
D (19:54:05.559) lwip: udp (
D (19:54:05.562) lwip: 192.168.0.203
D (19:54:05.566) lwip: , 55132) <-- (
D (19:54:05.570) lwip: 195.130.130.1
D (19:54:05.573) lwip: , 53)
D (19:54:05.577) lwip: pcb (
D (19:54:05.580) lwip: 0.0.0.0
D (19:54:05.583) lwip: , 55132) <-- (
D (19:54:05.586) lwip: 0.0.0.0
D (19:54:05.590) lwip: , 0)
D (19:54:05.593) lwip: pcb (
D (19:54:05.596) lwip: 0.0.0.0
D (19:54:05.599) lwip: , 64662) <-- (
D (19:54:05.602) lwip: 0.0.0.0
D (19:54:05.606) lwip: , 0)
D (19:54:05.609) lwip: pcb (
D (19:54:05.611) lwip: 0.0.0.0
D (19:54:05.615) lwip: , 68) <-- (
D (19:54:05.618) lwip: 0.0.0.0
D (19:54:05.621) lwip: , 67)
D (19:54:05.624) lwip: udp_input: calculating checksum
D (19:54:05.630) lwip: dns_tmr: dns_check_entries
D (19:54:05.635) lwip: dns_send: dns_servers[0] "ipv4.cloudns.net": request
D (19:54:05.642) lwip: sending DNS request ID 4414 for name "ipv4.cloudns.net" to server 0
D (19:54:05.650) lwip: udp_send
D (19:54:05.654) lwip: udp_send: added header in given pbuf 0x3fcc38d8
Steps to reproduce.
Code at https://sourceforge.net/p/lilygo-t-sim- ... webserver/
Debug Logs.
No response
More Information.
Please tell me how to figure out what's wrong
OK, I think I found the reason for this problem, but no solution yet.
The esp-netif layer gives the impression that DNS servers are specified per netif (you call it with a netif as a parameter). When querying the DNS servers after network setup, it's clear that each successful connection sets the DNS servers for every netif (that part is documented). Example:
I (20:06:27.660) Network: fixup_dns: default netif wifi
I (20:06:27.666) Network: List DNS servers
I (20:06:27.671) Network: IF 0 ppp dns 0 : 80.201.237.238
I (20:06:27.677) Network: IF 0 ppp dns 1 : 80.201.237.239
I (20:06:27.683) Network: IF 0 ppp dns 2 : 0.0.0.0
I (20:06:27.688) Network: IF 1 wifi dns 0 : 80.201.237.238
I (20:06:27.695) Network: IF 1 wifi dns 1 : 80.201.237.239
I (20:06:27.701) Network: IF 1 wifi dns 2 : 0.0.0.0
FYI, my code currently sets up WiFi first, then cell service, so the DNS servers you see are those of the mobile network. That's why my app wouldn't work: the default netif is WiFi, which is on another network, so those servers don't respond.
The netif layer calls LWIP without the netif argument (see esp_netif_set_dns_info_api()) so that's where info gets lost.
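To make the effect concrete, here is a minimal host-side model of what the listing above suggests: a single global server table that every interface writes into, so the last interface to come up wins. This is only a sketch, not lwIP or esp-netif code; all names (model_set_dns, model_netif_set_dns, model_netif_get_dns, MODEL_DNS_MAX_SERVERS) are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical constant: three slots, matching dns 0, 1, 2 in the listing above. */
#define MODEL_DNS_MAX_SERVERS 3

/* One global table shared by all interfaces (assumption based on the listing). */
static char g_dns_servers[MODEL_DNS_MAX_SERVERS][16];

/* Stand-in for a "set DNS server" call that has no netif argument. */
void model_set_dns(int slot, const char *addr)
{
    assert(slot >= 0 && slot < MODEL_DNS_MAX_SERVERS);
    strncpy(g_dns_servers[slot], addr, sizeof g_dns_servers[slot] - 1);
    g_dns_servers[slot][sizeof g_dns_servers[slot] - 1] = '\0';
}

/* Per-netif wrapper: it accepts an interface name but cannot pass it on. */
void model_netif_set_dns(const char *netif_name, int slot, const char *addr)
{
    (void)netif_name; /* the interface identity is dropped here */
    model_set_dns(slot, addr);
}

/* Per-netif getter: every interface reads the same global slot. */
const char *model_netif_get_dns(const char *netif_name, int slot)
{
    (void)netif_name;
    assert(slot >= 0 && slot < MODEL_DNS_MAX_SERVERS);
    return g_dns_servers[slot];
}
```

In this model, setting slot 0 for "wifi" and then for "ppp" leaves both interfaces reporting the ppp server, which matches the "List DNS servers" output above.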
Setting DNS servers 0, 1 and 2 explicitly doesn't work reliably either; the result of each attempt is noted in the comments:
#if 1
// This works
uint32_t ip1 = esp_ip4addr_aton("195.130.130.1"); // asse.dnscache01.telenet-ops.be.
uint32_t ip2 = esp_ip4addr_aton("195.130.131.1"); // asse.dnscache02.telenet-ops.be.
uint32_t ip3 = esp_ip4addr_aton("80.201.237.238"); // something jimmobile.be
#endif
#if 0
// This fails
uint32_t ip1 = esp_ip4addr_aton("195.130.130.1"); // asse.dnscache01.telenet-ops.be.
uint32_t ip2 = esp_ip4addr_aton("80.201.237.238"); // something jimmobile.be
uint32_t ip3 = esp_ip4addr_aton("195.130.131.1"); // asse.dnscache02.telenet-ops.be.
#endif
#if 0
// This fails
uint32_t ip1 = esp_ip4addr_aton("195.130.130.1"); // asse.dnscache01.telenet-ops.be.
uint32_t ip2 = esp_ip4addr_aton("80.201.237.238"); // something jimmobile.be
uint32_t ip3 = esp_ip4addr_aton("216.239.32.10"); // ns.google.com
#endif
It's unclear to me why only one of these appears to work, and how to proceed. Should an application catch esp-netif availability and set DNS servers based on priority ? If yes then it would seem that the priorities in the netif layer are not useful/working. Help ;-)
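The application-level idea raised above (track which netifs are up and apply the DNS servers of the preferred one) could be sketched roughly as follows. This is a host-side model under stated assumptions, not a confirmed fix: the app_netif_t type, its fields, and app_pick_dns_source are hypothetical, and the priority values are whatever the application chooses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical application-side record of one interface. */
typedef struct {
    const char *name;
    int prio;            /* higher value = preferred interface (assumption) */
    bool up;             /* updated by the application on connect/disconnect */
    const char *dns[2];  /* servers the application saved when this netif came up */
} app_netif_t;

/* Return the highest-priority interface that is up, or NULL if none is. */
const app_netif_t *app_pick_dns_source(const app_netif_t *ifs, size_t n)
{
    assert(ifs != NULL || n == 0);
    const app_netif_t *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (!ifs[i].up) {
            continue;
        }
        if (best == NULL || ifs[i].prio > best->prio) {
            best = &ifs[i];
        }
    }
    return best;
}
```

On target, the application would presumably call this from its IP event handlers and then write the chosen servers back with esp_netif_set_dns_info(); whether that is sufficient on its own is not established in this thread.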
https://github.com/espressif/esp-idf/issues/6270#issuecomment-745288299
This is a known limitation in IDF/lwip (also documented in: https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-guides/lwip.html#adapted-apis)
This will be handled in the esp_netif layer in https://github.com/espressif/esp-idf/issues/6270 (closing this one as a duplicate issue)