RIOT
RIOT copied to clipboard
gnrc_ipv6_nib: Force unspecified next hop addresses
Reserve unspecified address in NCEs for on-link prefixes, to have
_PFX_ON_LINK => ipv6_addr_is_unspecified(dst->next_hop)
be true
Contribution description
In nib_[onl|offl]_entry_t, a next_hop of NULL (as used by prefix list) shouldn't be interpreted as "don't care" but as an explicit "none, not applicable", because the next_hop value is used for association with a router at https://github.com/RIOT-OS/RIOT/blob/270aa7012fab0ceadeb0b82546ea06dbfe9c0902/sys/net/gnrc/network_layer/ipv6/nib/nib.c#L1469
The code intends to delete all off-link prefixes where next hop is the specified router [1], but (without this PR) it actually also deletes on-link prefixes on the same interface [2] because they falsely use the same next hop. (And are only distinguished as on-link through _PFX_ON_LINK
flag, which however is not checked for in this place.)
(Note that on-link prefixes are added again when processing a RA Prefix Information Option, which can be in the same Router Advertisement that caused _handle_rtr_timeout
and thereby their accidental deletion. This seems to be common and makes this deletion hard to notice.)
[1] which is the wrong thing to do in the first place, and was therefore removed in 96480008caaed8fa6af80a3afc1d6cb736287a42, thereby making this bug theoretical only [2] such as "the /128 prefix used for managing a temporary address" in #20369 (affected)
Testing procedure
Tests: #20372
Murdock results
:x: FAILED
3ab7896b1880343f979915c18a7f1698f537639b gnrc_ipv6_nib: Force unspecified next hop addresses
Success | Failures | Total | Runtime |
---|---|---|---|
10193 | 0 | 10195 | 14m:06s |
Artifacts
Now some tests are failing, is this intended?
Now some tests are failing, is this intended?
I would guess not. I can only tell so far that we fail because _nib_onl_alloc()
fails with next_hop == NULL
.
_nib-internal.c:509
dst->next_hop = _nib_onl_alloc(next_hop, iface);
if (dst->next_hop == NULL) {
memset(dst, 0, sizeof(_nib_offl_entry_t));
return NULL;
}
Maybe no neighbor cache entry should be allocated at all if no next hop address is known.
Neighbor Cache: A set of entries about individual neighbors to which traffic has been sent recently. Entries are keyed on the neighbor's on-link unicast IP address and contain such information as its link-layer address, a flag indicating whether the neighbor is a router or a host (called IsRouter in this document), a pointer to any queued packets waiting for address resolution to complete, etc. A Neighbor Cache entry also contains information used by the Neighbor Unreachability Detection algorithm, including the reachability state, the number of unanswered probes, and the time the next Neighbor Unreachability Detection event is scheduled to take place.
If there is no IPv6 address for a neighbor and neighbors are keyed by their address, why would we want to add a neighbor cache entry at all?
Now some tests are failing, is this intended?
No, but turns out they test in a wrong way. :D 🤯 Opened #20377
I would guess not. I can only tell so far that we fail because _nib_onl_alloc() fails with next_hop == NULL.
It feels like the test was working by coincidence.
We create one NCE for the default router from the RA.
Then we process the PIO and add a prefix.
For that we want to add a NCE with (next_hop = NULL
) and iface = 5
.
So for a prefix, the NCE only stores the interface on which the Prefix has been received.
The NCE is found because the NCE exists from the default router but it is only matched by the interface and just because address is NULL
returns true in _addr_equals
. There could have been any other Neighbor with interface 5 in the NCE, and I think this would create a wrong reference.
I think it should not be an NCE without an address just to store an interface number for a prefix.
Prefix List - A list of the prefixes that define a set of
addresses that are on-link. Prefix List entries
are created from information received in Router
Advertisements. Each entry has an associated
invalidation timer value (extracted from the
advertisement) used to expire prefixes when they
become invalid. A special "infinity" timer value
specifies that a prefix remains valid forever,
unless a new (finite) value is received in a
subsequent advertisement.
The link-local prefix is considered to be on the
prefix list with an infinite invalidation timer
regardless of whether routers are advertising a
prefix for it. Received Router Advertisements
SHOULD NOT modify the invalidation timer for the
link-local prefix.
Default Router List
- A list of routers to which packets may be sent.
Router list entries point to entries in the
Neighbor Cache; the algorithm for selecting a
default router favors routers known to be reachable
over those whose reachability is suspect. Each
entry also has an associated invalidation timer
value (extracted from Router Advertisements) used
to delete entries that are no longer advertised.
Conceptionally the DRL and PL are two different data structures. ~~However, we aggregate them together as offlink entries.~~ (sorry offlink is _PFX, _DC and _FT) For entries in the DRL it is explicitly stated that a DR references an NCE, but no word about it for prefixes.
If this causes a test to fail that was testing the wrong thing, you also need to add the fix to the test in this PR, otherwise there is no way to merge this.
Thanks for the heads up. The erroneous test (https://github.com/RIOT-OS/RIOT/issues/20377) is already fixed in master. Therefore I rebased.
Squashed your suggested change.
CI finds that tests/unittests
is now failing