
[HELP] `dhcpd_stop()` does not return if no packets were received (recv blocking behavior)

Open engdavidiogo opened this issue 8 months ago • 6 comments

Description

This issue documents the observed behavior of the dhcpd daemon (apps/netutils/dhcpd) when dhcpd_stop() is called after starting the daemon without any DHCP packets being received.

Observed Behavior

When starting the DHCP server using dhcpd_start() and stopping it with dhcpd_stop(), the dhcpd_stop() call blocks indefinitely if no DHCP packets were received during the server execution. This blocks further logic execution, including restarting the daemon.

Sequence:

  1. Start DHCPD using dhcpd_start("wlan0") (or "eth0" on sim).
  2. No devices send DHCP requests.
  3. After 30 seconds, call dhcpd_stop().
  4. dhcpd_stop() sends the configured signal (default NETUTILS_DHCPD_SIGWAKEUP) using kill().
  5. The daemon remains stuck inside the recv() call.
  6. The semaphore used to synchronize (ds_sync) is never released.
  7. The task state remains as Stopped, but the stack is not reclaimed.

Platforms Tested

  • esp32c6-devkitc:wifi (interface: wlan0)
  • sim:tcpblaster (interface: eth0)

dhcpd daemon state

Example output from ps after calling dhcpd_stop():

PID GROUP PRI POLICY   TYPE    NPX STATE    EVENT     SIGMASK            STACK COMMAND
...
6     6 100 RR       Task      - Stopped            0000000000400000 0004040 DHCPD_daemon wlan0
...

Despite being marked as "Stopped", the task is never released due to blocking on recv().

Impact

This behavior prevents:

  • Reusing the DHCP daemon in long-running applications.
  • Performing controlled shutdowns or restarts.
  • Completing the application logic after dhcpd_stop().

Questions

  1. Is this behavior known to the NuttX community?

  2. Is there any official recommendation on how to handle the graceful termination of dhcpd, considering that it blocks on recv()?

  3. Is there a standard in NuttX for handling blocking recv() in daemons, allowing the task to be terminated safely via signal or other mechanism?

Verification

  • [x] I have verified before submitting the report.

engdavidiogo avatar Apr 04 '25 15:04 engdavidiogo

@engdavidiogo could you please share your .config ?

@xiaoxiang781216 @lupyuen did you already see this behavior before?

acassis avatar Apr 08 '25 13:04 acassis

Sure @acassis I will attach my defconfig from esp32c6-devkitc:wifi

CONFIG_ALLOW_BSD_COMPONENTS=y
CONFIG_ARCH="risc-v"
CONFIG_ARCH_BOARD="esp32c6-devkitc"
CONFIG_ARCH_BOARD_COMMON=y
CONFIG_ARCH_BOARD_ESP32C6_DEVKITC=y
CONFIG_ARCH_CHIP="esp32c6"
CONFIG_ARCH_CHIP_ESP32C6=y
CONFIG_ARCH_CHIP_ESP32C6WROOM1=y
CONFIG_ARCH_INTERRUPTSTACK=2048
CONFIG_ARCH_RISCV=y
CONFIG_ARCH_STACKDUMP=y
CONFIG_BOARD_LOOPSPERMSEC=15000
CONFIG_BUILTIN=y
CONFIG_DEV_ZERO=y
CONFIG_DRIVERS_IEEE80211=y
CONFIG_DRIVERS_WIRELESS=y
CONFIG_ESPRESSIF_ESP32C6=y
CONFIG_ESPRESSIF_SPIFLASH=y
CONFIG_ESPRESSIF_SPIFLASH_SPIFFS=y
CONFIG_ESPRESSIF_WIFI=y
CONFIG_EXAMPLES_HELLO=y
CONFIG_EXAMPLES_HELLO_STACKSIZE=10240
CONFIG_EXAMPLES_RANDOM=y
CONFIG_FS_PROCFS=y
CONFIG_IDLETHREAD_STACKSIZE=2048
CONFIG_INIT_ENTRYPOINT="nsh_main"
CONFIG_INIT_STACKSIZE=8192
CONFIG_INTELHEX_BINARY=y
CONFIG_IOB_THROTTLE=24
CONFIG_LIBC_PERROR_STDOUT=y
CONFIG_LIBC_STRERROR=y
CONFIG_NETDB_DNSCLIENT=y
CONFIG_NETDEV_LATEINIT=y
CONFIG_NETDEV_PHY_IOCTL=y
CONFIG_NETDEV_WIRELESS_IOCTL=y
CONFIG_NETDEV_WORK_THREAD=y
CONFIG_NETINIT_DHCPC=y
CONFIG_NETUTILS_CJSON=y
CONFIG_NETUTILS_DHCPD=y
CONFIG_NETUTILS_DHCPD_STACKSIZE=4096
CONFIG_NET_BROADCAST=y
CONFIG_NET_ICMP_SOCKET=y
CONFIG_NET_TCP=y
CONFIG_NET_TCPBACKLOG=y
CONFIG_NET_TCP_DELAYED_ACK=y
CONFIG_NET_TCP_WRITE_BUFFERS=y
CONFIG_NET_UDP=y
CONFIG_NFILE_DESCRIPTORS_PER_BLOCK=6
CONFIG_NSH_ARCHINIT=y
CONFIG_NSH_BUILTIN_APPS=y
CONFIG_NSH_FILEIOSIZE=512
CONFIG_NSH_READLINE=y
CONFIG_NSH_STRERROR=y
CONFIG_PREALLOC_TIMERS=0
CONFIG_PTHREAD_MUTEX_TYPES=y
CONFIG_RR_INTERVAL=200
CONFIG_SCHED_BACKTRACE=y
CONFIG_SCHED_LPWORK=y
CONFIG_SCHED_WAITPID=y
CONFIG_SIG_DEFAULT=y
CONFIG_START_DAY=29
CONFIG_START_MONTH=11
CONFIG_START_YEAR=2019
CONFIG_SYSTEM_DHCPC_RENEW=y
CONFIG_SYSTEM_DUMPSTACK=y
CONFIG_SYSTEM_NSH=y
CONFIG_SYSTEM_PING=y
CONFIG_TESTING_GETPRIME=y
CONFIG_TESTING_OSTEST=y
CONFIG_TLS_TASK_NELEM=4
CONFIG_UART0_SERIAL_CONSOLE=y
CONFIG_WIRELESS=y
CONFIG_WIRELESS_WAPI=y
CONFIG_WIRELESS_WAPI_CMDTOOL=y
CONFIG_WIRELESS_WAPI_INITCONF=y

engdavidiogo avatar Apr 08 '25 14:04 engdavidiogo

@engdavidiogo I don't see anything wrong with this config.

@wengzhe since you have more experience with networking, could you please give some advice here?

acassis avatar Apr 10 '25 12:04 acassis

Hi @wengzhe, do we have any information about this?

engdavidiogo avatar May 15 '25 13:05 engdavidiogo

I see this too.

How about adding SO_RCVTIMEO with a Kconfig-set timeout (a value of 0 would mean no timeout)? That would let the recv() time out and allow the while loop to exit when DHCPD_STOP_REQUESTED is true.

Is that an appropriate solution?
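
A rough sketch of the SO_RCVTIMEO idea, written against plain POSIX sockets rather than the actual dhcpd source. The names open_udp_with_timeout, request_stop, serve_until_stopped and g_stop_requested are hypothetical, and the timeout argument stands in for whatever Kconfig option would be added. On NuttX, socket options also have to be enabled in the configuration (CONFIG_NET_SOCKOPTS, as far as I know).

```c
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <unistd.h>

/* Hypothetical stop flag, checked by the receive loop on every wakeup. */
static volatile sig_atomic_t g_stop_requested = 0;

void request_stop(void)
{
  g_stop_requested = 1;
}

/* Open a UDP socket bound to loopback on an ephemeral port.
 * timeout_ms > 0 sets a receive timeout; 0 leaves recv() fully blocking.
 * Returns the socket descriptor, or -1 on failure.
 */
int open_udp_with_timeout(long timeout_ms)
{
  struct sockaddr_in addr;
  struct timeval tv;
  int fd;

  fd = socket(AF_INET, SOCK_DGRAM, 0);
  if (fd < 0)
    {
      return -1;
    }

  if (timeout_ms > 0)
    {
      tv.tv_sec = timeout_ms / 1000;
      tv.tv_usec = (timeout_ms % 1000) * 1000;
      if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0)
        {
          close(fd);
          return -1;
        }
    }

  memset(&addr, 0, sizeof(addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port = 0;
  if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
    {
      close(fd);
      return -1;
    }

  return fd;
}

/* Receive until a stop is requested. Returns 0 on a clean stop and
 * -1 on an unexpected socket error.
 */
int serve_until_stopped(int fd)
{
  char buf[512];

  while (!g_stop_requested)
    {
      ssize_t n = recv(fd, buf, sizeof(buf), 0);
      if (n < 0)
        {
          if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
            {
              continue;  /* timeout or signal: go back and re-check the flag */
            }

          return -1;
        }

      /* A real server would parse the n-byte message here. */
    }

  return 0;
}
```

With a timeout set, the loop re-checks the flag at least once per timeout period even if the wakeup signal never interrupts recv(), so a stop request is noticed after at most one timeout. The cost is a periodic wakeup while the server is idle.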

TimJTi avatar May 18 '25 12:05 TimJTi

OK - I'm NOT seeing this. I do have an issue with my USB CDC-NCM after running dhcpd_start, but the actual daemon is behaving correctly for me, starting/stopping regardless of whether any packets are received.

This might be wifi or arch specific for you so I graciously bow out.

TimJTi avatar May 19 '25 11:05 TimJTi

@TimJTi Hey, I have a question. Maybe this is a case for adding the SO_LINGER option? The wifi/arch layer is probably queuing some data. Do you have any tips for debugging this behaviour?

thiagofinelon avatar Aug 25 '25 14:08 thiagofinelon