RIOT icon indicating copy to clipboard operation
RIOT copied to clipboard

sys/net/application_layer/gcoap: fix Observe notifications correlation

Open fabian18 opened this issue 1 year ago • 4 comments

Contribution description

When a CONfirmable CoAP request with an Observe option is sent, the sending client has to store the CoAP token to match multiple responses from the server sending Observe notifications. The gcoap memo is not cleared when the 2.xx response for the client registration contains an Observe option.

https://github.com/RIOT-OS/RIOT/blob/3f41494e59dc915fe17f67ba3b40d0827b9b3a36/sys/net/application_layer/gcoap/gcoap.c#L521-L531

For a confirmable request with a token length >0, the problem happens right before, where the resend buffer is released by actually overwriting the first byte of the request header in pdu_buf with 0. The first byte contains the token length which is required to match in:

https://github.com/RIOT-OS/RIOT/blob/3f41494e59dc915fe17f67ba3b40d0827b9b3a36/sys/net/application_layer/gcoap/gcoap.c#L888-L914

There, the overwritten request header is accessed in gcoap_request_memo_get_hdr().

https://github.com/RIOT-OS/RIOT/blob/3f41494e59dc915fe17f67ba3b40d0827b9b3a36/sys/include/net/gcoap.h#L1182-L1190

Solution:

When a response is received, the retransmission buffer is released, but before the header is copied to hdr_buf in:

https://github.com/RIOT-OS/RIOT/blob/3f41494e59dc915fe17f67ba3b40d0827b9b3a36/sys/include/net/gcoap.h#L809-L833

Testing procedure

Internal project. CoAP Observe is not so well tested. I hope the explanation makes sense. In examples/gcoap/client.c:

https://github.com/RIOT-OS/RIOT/blob/3f41494e59dc915fe17f67ba3b40d0827b9b3a36/examples/gcoap/client.c#L397-L408

the notification is sent right after the registration was sent. I have not tried that.

Issues/PRs references

Depends on and includes #20711

fabian18 avatar May 22 '24 14:05 fabian18

Murdock results

:heavy_check_mark: PASSED

21a46dcf43725b618545e255eca27deb0aa79879 examples/cord_lc: adapt Makefile.ci

Success Failures Total Runtime
10193 0 10193 14m:07s

Artifacts

riot-ci avatar May 22 '24 15:05 riot-ci

Of course the Observe notification must be sent from the same address as well ... I am at it to add it here.

https://github.com/RIOT-OS/RIOT/blob/230df5fe3b2deb69353c8f09c86b2654bd1852c6/sys/net/application_layer/gcoap/gcoap.c#L1824-L1837

fabian18 avatar May 30 '24 16:05 fabian18

Depends on #20711 now due to aux tx local supply in _handle_req()

fabian18 avatar May 31 '24 10:05 fabian18

for better testing I added an /rtc resource to the gcoap example. Use our tapsetup.sh and observe the notifications in wireshark for two native instances.

fabian18 avatar Jun 05 '24 19:06 fabian18

what's the best way to test this?

I used to test with 2 instances of "examples/gcoap" and tapsetup, but now I cannot ping from tap0 to tap1 anymore - strange.

fabian18 avatar Aug 19 '24 19:08 fabian18

I used to test with 2 instances of "examples/gcoap" and tapsetup, but now I cannot ping from tap0 to tap1 anymore - strange.

Hmm... this article solved my issue with:

sudo sysctl -w net.bridge.bridge-nf-call-ip6tables=0

fabian18 avatar Aug 20 '24 08:08 fabian18

Thanks for testing. I will look into it. It could be that it is some expected limitation.

fabian18 avatar Aug 21 '24 08:08 fabian18

Only one observer per resource is supported:

 * A CoAP client may register for Observe notifications for any resource that
 * an application has registered with gcoap. An application does not need to
 * take any action to support Observe client registration. However, gcoap
 * limits registration for a given resource to a _single_ observer.

Because, given a request, identifying a resource we look for only one gcoap_observe_memo_t that has a pointer to that resource. See https://github.com/RIOT-OS/RIOT/blob/c2efa3ef257e19b78400dbb549aa9a082712306d/sys/net/application_layer/gcoap/gcoap.c#L670

fabian18 avatar Aug 21 '24 12:08 fabian18

Maybe there is a CoAP way to tell the client, that Observe cannot be honored for the reason that another client is observing the resource currently.

fabian18 avatar Aug 21 '24 12:08 fabian18

Please squash directly

benpicco avatar Aug 21 '24 15:08 benpicco

Maybe there is a CoAP way to tell the client, that Observe cannot be honored for the reason that another client is observing the resource currently.

From RFC7641

A server that is unable or unwilling to add a new entry to the list of observers of a resource MAY silently ignore the registration request and process the GET request as usual. The resulting response MUST NOT include an Observe Option, the absence of which signals to the client that it will not be notified of changes to the resource and, e.g., needs to poll the resource for its state instead.

fabian18 avatar Aug 21 '24 15:08 fabian18

Looks like Makefile.ci needs an update - make generate-Makefile.ci should do

benpicco avatar Aug 21 '24 21:08 benpicco

I executed make -C examples/gcoap generate-Makefile.ci but actually it only purged boards from the list because I guess the builds are failing in the Docker container due to permission issue: Example:

Launching build container using image "sha256:f5951bc41dfface6cac869181d703e62cbdd3b7976b0946130a38f2e658000b3".
docker run --rm --tty --user $(id -u) -v '/usr/share/zoneinfo/Europe/Berlin:/etc/localtime:ro' -v '/home/[email protected]/RIOT:/data/riotbuild/riotbase:delegated' -v '/home/[email protected]/.cargo/registry:/data/riotbuild/.cargo/registry:delegated' -v '/home/[email protected]/.cargo/git:/data/riotbuild/.cargo/git:delegated' -e 'RIOTBASE=/data/riotbuild/riotbase' -e 'CCACHE_BASEDIR=/data/riotbuild/riotbase' -e 'BUILD_DIR=/data/riotbuild/riotbase/build' -e 'BUILD_IN_DOCKER=/data/riotbuild/riotbase/examples/gcoap/1' -e 'RIOTPROJECT=/data/riotbuild/riotbase' -e 'RIOTCPU=/data/riotbuild/riotbase/cpu' -e 'RIOTBOARD=/data/riotbuild/riotbase/boards' -e 'RIOTMAKE=/data/riotbuild/riotbase/makefiles'      -e 'BOARD=stm32f469i-disco' -e 'RIOT_CI_BUILD=1' -e 'BOARD=stm32f469i-disco' -e 'RIOT_CI_BUILD=1' -e 'DISABLE_MODULE=' -e 'DEFAULT_MODULE=' -e 'FEATURES_REQUIRED=' -e 'FEATURES_BLACKLIST=' -e 'FEATURES_OPTIONAL=periph_rtc' -e 'USEMODULE=auto_init_gnrc_netif fmt gcoap gnrc_icmpv6_echo gnrc_ipv6_default netdev_default netutils od ps random shell shell_cmds_default uri_parser' -e 'USEPKG=' -e 'DISABLE_MODULE=' -e 'DEFAULT_MODULE=' -e 'FEATURES_REQUIRED=' -e 'FEATURES_BLACKLIST=' -e 'FEATURES_OPTIONAL=periph_rtc' -e 'USEMODULE=auto_init_gnrc_netif fmt gcoap gnrc_icmpv6_echo gnrc_ipv6_default netdev_default netutils od ps random shell shell_cmds_default uri_parser' -e 'USEPKG='  -w '/data/riotbuild/riotbase/examples/gcoap/' 'sha256:f5951bc41dfface6cac869181d703e62cbdd3b7976b0946130a38f2e658000b3' make 'BOARD=stm32f469i-disco' 'BOARD=stm32f469i-disco'   -j16  all
Building application "gcoap_example" for "stm32f469i-disco" with CPU "stm32".

Traceback (most recent call last):
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_vectors.py", line 147, in <module>
    main(PARSER.parse_args())
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_vectors.py", line 139, in main
    generate_vectors(context)
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_vectors.py", line 132, in generate_vectors
    with open(dest_file, "w") as f_dest:
PermissionError: [Errno 13] Permission denied: '/data/riotbuild/riotbase/cpu/stm32/vectors/STM32F469xx.c'
make: *** [/data/riotbuild/riotbase/cpu/stm32/Makefile.include:109: /data/riotbuild/riotbase/cpu/stm32/vectors/STM32F469xx.c] Error 1
make: *** Waiting for unfinished jobs....
Traceback (most recent call last):
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_irqs.py", line 155, in <module>
    main(PARSER.parse_args())
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_irqs.py", line 147, in main
    generate_irqs(context)
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_irqs.py", line 130, in generate_irqs
    with open(dest_file, "w") as f_dest:
PermissionError: [Errno 13] Permission denied: '/data/riotbuild/riotbase/cpu/stm32/include/irqs/f4/irqs.h'
make: *** [/data/riotbuild/riotbase/cpu/stm32/Makefile.include:123: /data/riotbuild/riotbase/cpu/stm32/include/irqs/f4/irqs.h] Error 1
make[1]: *** [/home/[email protected]/RIOT/makefiles/docker.inc.mk:391: ..in-docker-container] Error 2
make[1]: Leaving directory '/home/[email protected]/RIOT/examples/gcoap'
stm32f4discovery                        build failed

fabian18 avatar Aug 22 '24 15:08 fabian18

I executed make -C examples/gcoap generate-Makefile.ci but actually it only purged boards from the list because I guess the builds are failing in the Docker container due to permission issue: Example:

Launching build container using image "sha256:f5951bc41dfface6cac869181d703e62cbdd3b7976b0946130a38f2e658000b3".
docker run --rm --tty --user $(id -u) -v '/usr/share/zoneinfo/Europe/Berlin:/etc/localtime:ro' -v '/home/[email protected]/RIOT:/data/riotbuild/riotbase:delegated' -v '/home/[email protected]/.cargo/registry:/data/riotbuild/.cargo/registry:delegated' -v '/home/[email protected]/.cargo/git:/data/riotbuild/.cargo/git:delegated' -e 'RIOTBASE=/data/riotbuild/riotbase' -e 'CCACHE_BASEDIR=/data/riotbuild/riotbase' -e 'BUILD_DIR=/data/riotbuild/riotbase/build' -e 'BUILD_IN_DOCKER=/data/riotbuild/riotbase/examples/gcoap/1' -e 'RIOTPROJECT=/data/riotbuild/riotbase' -e 'RIOTCPU=/data/riotbuild/riotbase/cpu' -e 'RIOTBOARD=/data/riotbuild/riotbase/boards' -e 'RIOTMAKE=/data/riotbuild/riotbase/makefiles'      -e 'BOARD=stm32f469i-disco' -e 'RIOT_CI_BUILD=1' -e 'BOARD=stm32f469i-disco' -e 'RIOT_CI_BUILD=1' -e 'DISABLE_MODULE=' -e 'DEFAULT_MODULE=' -e 'FEATURES_REQUIRED=' -e 'FEATURES_BLACKLIST=' -e 'FEATURES_OPTIONAL=periph_rtc' -e 'USEMODULE=auto_init_gnrc_netif fmt gcoap gnrc_icmpv6_echo gnrc_ipv6_default netdev_default netutils od ps random shell shell_cmds_default uri_parser' -e 'USEPKG=' -e 'DISABLE_MODULE=' -e 'DEFAULT_MODULE=' -e 'FEATURES_REQUIRED=' -e 'FEATURES_BLACKLIST=' -e 'FEATURES_OPTIONAL=periph_rtc' -e 'USEMODULE=auto_init_gnrc_netif fmt gcoap gnrc_icmpv6_echo gnrc_ipv6_default netdev_default netutils od ps random shell shell_cmds_default uri_parser' -e 'USEPKG='  -w '/data/riotbuild/riotbase/examples/gcoap/' 'sha256:f5951bc41dfface6cac869181d703e62cbdd3b7976b0946130a38f2e658000b3' make 'BOARD=stm32f469i-disco' 'BOARD=stm32f469i-disco'   -j16  all
Building application "gcoap_example" for "stm32f469i-disco" with CPU "stm32".

Traceback (most recent call last):
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_vectors.py", line 147, in <module>
    main(PARSER.parse_args())
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_vectors.py", line 139, in main
    generate_vectors(context)
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_vectors.py", line 132, in generate_vectors
    with open(dest_file, "w") as f_dest:
PermissionError: [Errno 13] Permission denied: '/data/riotbuild/riotbase/cpu/stm32/vectors/STM32F469xx.c'
make: *** [/data/riotbuild/riotbase/cpu/stm32/Makefile.include:109: /data/riotbuild/riotbase/cpu/stm32/vectors/STM32F469xx.c] Error 1
make: *** Waiting for unfinished jobs....
Traceback (most recent call last):
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_irqs.py", line 155, in <module>
    main(PARSER.parse_args())
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_irqs.py", line 147, in main
    generate_irqs(context)
  File "/data/riotbuild/riotbase/cpu/stm32/dist/irqs/gen_irqs.py", line 130, in generate_irqs
    with open(dest_file, "w") as f_dest:
PermissionError: [Errno 13] Permission denied: '/data/riotbuild/riotbase/cpu/stm32/include/irqs/f4/irqs.h'
make: *** [/data/riotbuild/riotbase/cpu/stm32/Makefile.include:123: /data/riotbuild/riotbase/cpu/stm32/include/irqs/f4/irqs.h] Error 1
make[1]: *** [/home/[email protected]/RIOT/makefiles/docker.inc.mk:391: ..in-docker-container] Error 2
make[1]: Leaving directory '/home/[email protected]/RIOT/examples/gcoap'
stm32f4discovery                        build failed

Had to delete cpu/stm32/vectors/* and cpu/stm32/include/irqs/*

fabian18 avatar Aug 25 '24 17:08 fabian18