OTBR 1.3 reference device neighbor issue

Open edmont opened this issue 2 years ago • 5 comments

I'm trying to validate the OTBR 1.3 reference device as Thread Certification compliant, but some Commercial tests are failing.

The reference device was built from this branch: https://github.com/openthread/ot-reference-release/compare/main...edmont:pr-add-ncs-1-3 using this command: REFERENCE_PLATFORM=ncs REFERENCE_RELEASE_TYPE=1.3 ./script/make-reference-release.bash

The problem in test C5.9.1 appears to be that the Host device ignores the Neighbor Advertisements that BR_1 sends on behalf of the Router's registered DUA. This starts at frame 343 in the attached capture. Even though frame 348 looks like a proper Neighbor Advertisement, the Host device shows this in its neighbor table: fd00:7d03:7d03:7d03:7b27:8795:9d55:e9fd dev eth0 FAILED
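For anyone reproducing this, the neighbor table check can be scripted. Below is a minimal sketch that parses text in the format printed by `ip -6 neigh show` and lists entries in the FAILED state. The function names are hypothetical, and the second sample line is made up for illustration; only the first sample line comes from this report.

```python
def parse_neighbors(output):
    """Parse `ip -6 neigh show` style text into a list of dicts."""
    entries = []
    for line in output.splitlines():
        tokens = line.split()
        # Expect at least: <address> dev <interface> <STATE>
        if len(tokens) < 4 or tokens[1] != "dev":
            continue
        lladdr = None
        # A link-layer address is only present once resolution has succeeded.
        if "lladdr" in tokens[:-1]:
            lladdr = tokens[tokens.index("lladdr") + 1]
        entries.append({
            "addr": tokens[0],
            "dev": tokens[2],
            "lladdr": lladdr,
            "state": tokens[-1],  # the state is the last token on the line
        })
    return entries


def failed_neighbors(output):
    """Return the addresses whose neighbor entry is in the FAILED state."""
    return [e["addr"] for e in parse_neighbors(output) if e["state"] == "FAILED"]


# First line is from this issue; the second is an illustrative, made-up entry.
SAMPLE = """\
fd00:7d03:7d03:7d03:7b27:8795:9d55:e9fd dev eth0 FAILED
fe80::1 dev eth0 lladdr 02:00:00:00:00:01 router REACHABLE
"""
```

Calling `failed_neighbors(SAMPLE)` returns the DUA from this report, since it is the only entry without a resolved link-layer address.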

C_5_9_1-fail.zip

@simonlingoogle any idea what could be going on? Might be something related to having BORDER_ROUTING enabled?

edmont avatar May 03 '22 14:05 edmont

The BORDER_ROUTING feature creates an ICMPv6 socket to receive and process RS/RA messages. However, it does not receive or process NS/NA packets. And the Host should not be running BORDER_ROUTING anyway.
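To make the RS/RA versus NS/NA distinction concrete, here is a small sketch of a type-based filter. It is not OpenThread's code; the names are hypothetical, and only the ICMPv6 type numbers (from RFC 4861) are fixed facts.

```python
# ICMPv6 Neighbor Discovery message types defined in RFC 4861.
ROUTER_SOLICITATION = 133
ROUTER_ADVERTISEMENT = 134
NEIGHBOR_SOLICITATION = 135
NEIGHBOR_ADVERTISEMENT = 136

# Assumption for illustration: a socket that only cares about RS/RA.
HANDLED_TYPES = {ROUTER_SOLICITATION, ROUTER_ADVERTISEMENT}


def is_handled(icmp6_type):
    """Return True if a message of this ICMPv6 type would pass the filter."""
    return icmp6_type in HANDLED_TYPES
```

With such a filter, a Neighbor Advertisement (type 136) never reaches the RS/RA handler, so that handler cannot be what drops it.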

Looking at Pkt 343, I think the Host didn't receive it because it does not use the correct destination MAC address. The Host's MAC address is b8:27:eb:76:c1:bb, but Pkt 343 is sent to b8:27:eb:13:91:87. That's why the real b8:27:eb:13:91:87 device sent Pkt 345 to redirect Pkt 344.

b8:27:eb:13:91:87 seems to be coming out of nowhere. Is it another Host device (configured with the 910b::1 address) that was mistakenly placed in the network?
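The MAC mismatch described above can be checked mechanically once the relevant frames are exported from the capture. A rough sketch, assuming the frames meant for the Host have already been reduced to a frame number and a destination MAC (the dict layout and function names are hypothetical, and frame 999 is a made-up control entry):

```python
def normalize_mac(mac):
    """Lower-case a MAC address and use ':' as the separator."""
    return mac.strip().lower().replace("-", ":")


def misaddressed_frames(frames, host_mac):
    """Return the numbers of frames whose destination MAC is not the Host's."""
    expected = normalize_mac(host_mac)
    return [f["number"] for f in frames if normalize_mac(f["dst_mac"]) != expected]


HOST_MAC = "b8:27:eb:76:c1:bb"  # Host MAC address from this thread
FRAMES = [
    {"number": 343, "dst_mac": "b8:27:eb:13:91:87"},  # from this thread
    {"number": 999, "dst_mac": "B8:27:EB:76:C1:BB"},  # illustrative only
]
```

Here `misaddressed_frames(FRAMES, HOST_MAC)` flags frame 343 and accepts the control entry, regardless of letter case.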

Thoughts? @edmont

simonlingoogle avatar May 03 '22 15:05 simonlingoogle

Good catch! Thanks! I removed the RADVD configuration from all 4 connected Raspberry Pis and the problem went away. Most likely a test ended abruptly, without leaving time for the script to stop RADVD properly.

However, there is a new issue: https://github.com/openthread/ot-br-posix/issues/1358

edmont avatar May 04 '22 06:05 edmont

Should we create another ticket for the new ping issue?

For this Host issue, we may still need to fix some THCI scripts.

simonlingoogle avatar May 04 '22 07:05 simonlingoogle

_deviceBeforeReset should stop RADVD on Hosts.

It's also removing 910b::1

So, not sure what could be the issue.

simonlingoogle avatar May 05 '22 02:05 simonlingoogle

Yes, agreed, this has little to do with the THCI. Chances are that the Harness did not properly discover the device and never applied the reset to it, so RADVD stayed active in the test network.

edmont avatar May 05 '22 06:05 edmont

Closing stale issue.

jwhui avatar Aug 23 '22 04:08 jwhui