go-libp2p Filter out undefined/reserved addresses

Motivated by https://github.com/ipfs/kubo/issues/10327#issuecomment-1929604946.

For some reason there seem to be nodes out there that will advertise IP addresses that are invalid. For example: ::5054:ff:fe92:8bc9 (from one of the logs in the linked issue) seems to be an invalid IP address in that:

It falls under ::/8, which was reserved by IETF https://www.iana.org/assignments/ipv6-address-space/ipv6-address-space.xml
It was not one of the carveouts under ::/8 that was allocated (i.e. see footnotes 1 through 6 in https://www.iana.org/assignments/ipv6-address-space/ipv6-address-space.xml or https://www.iana.org/assignments/iana-ipv6-special-registry/iana-ipv6-special-registry.xhtml)

Adding a filter for at least these unspecified ::/8 addresses (although we could do more) might look like blocking that entire range except for:

::1/128 (loopback)
64:ff9b:1/48 (ipv4-ipv6 translation - private ranges)
64:ff9b/96 where the ending IP is a public IPv4 (ipv4-ipv6 translation - public IPv4s only per https://www.rfc-editor.org/rfc/rfc6052.html#section-3.1)

Note: ::/128 (unspecified) should also be blocked, but we already do that.

It seems like the additional filters could go where we already do filtering https://github.com/libp2p/go-libp2p/blob/ae44ef95cf65be12f3a18a124bb3bb4d7975f0fb/p2p/net/swarm/swarm_dial.go#L476-L502

It's possible I'm misunderstanding how some of these ranges are meant to be used, and if so please let me know.

cc @sukunrt @Stebalien

Feb 06 '24 13:02 aschmahmann

Will try to include in v0.33 release

Feb 08 '24 16:02 p-shahi

It looks like this list is incomplete: https://github.com/multiformats/go-multiaddr/blob/master/net/private.go#L46

If we are discovering those addresses via identify, once we fix that list these should get filtered.

However, I think it's cheap enough to just skip these addresses when dialing.
We should add manet.IsUnroutableAddr(addr ma.Multiaddr) bool and use it in the dial path. https://github.com/multiformats/go-multiaddr/issues/234

Feb 09 '24 07:02 sukunrt

That may not be sufficient. We likely need to switch from blacklisting bad addresses to whitelisting good address ranges, at least for IPv6. E.g., everything in 0000::/8 is invalid except the cases discussed above.

Feb 09 '24 22:02 Stebalien

I agree. Whitelisting seems a lot simpler. I've raised https://github.com/multiformats/go-multiaddr/pull/235

I'm classifying NAT64 as public because as I understand, in a NAT64 system these addresses can reference public IP addresses. However I am not sure which layer does this translation actually happens. From https://github.com/libp2p/go-libp2p/issues/2349#issuecomment-1593720934

It is still based on NAT64 except now instead of relying on DNS64 rewriting, the host OS (computer, phone, ...) just knows it should rewrite it to an 64:ff9b address and just do it.

This sounds like the os should receive an IPv4 address and it'll handle converting it to a NAT64 address of the form 64:ff9b::1.2.3.4 in which case we do should just consider these addresses as invalid since the translation should be happening in the os. I am not sure about this though so I've opted to maintain current behavior.

Feb 12 '24 14:02 sukunrt

Note: The fix above relies on not adding invalid addresses to the peer store. We do this for identify and dht(https://github.com/libp2p/go-libp2p-kad-dht/blob/master/dual/dual.go#L111)

Feb 12 '24 14:02 sukunrt

I'm classifying NAT64 as public because as I understand, in a NAT64 system these addresses can reference public IP addresses.

NAT64 can only reference public addresses, yes. Any NAT64 address referencing a private address should be treated as invalid.

This sounds like the os should receive an IPv4 address and it'll handle converting it to a NAT64 address of the form 64:ff9b::1.2.3.4 in which case we do should just consider these addresses as invalid since the translation should be happening in the os. I am not sure about this though so I've opted to maintain current behavior.

I'm not so sure. The RFC claims that 64:ff9b:: was introduced to avoid OS-level rewriting. See https://www.rfc-editor.org/rfc/rfc6052.html#section-4.2

Basically:

ipv4-mapped prefix (::ffff:0:0/96) -> IPv6 clients need to talk to a IPv4 server. Translation within the client's OS is expected because IPv4 is the preferred protocol).
This one (64:ff9b::) -> IPv4 and IPv6 clients need to talk to an IPv6 server. The IPv6 client is assigned a 64:f9b:: address so that the IPv4 client can "downgrade" the address to IPv4, but the IPv6-capable client should use the IPv6 address even if they support IPv4, to avoid translation.

At least that's my reading.

Feb 12 '24 16:02 Stebalien

I agree with your reasoning.

I think you meant:

This one (64:ff9b::) -> IPv4 and IPv6 clients need to talk to an IPv6 server. The ~IPv6~ IPv4 client is assigned a 64:f9b:: address so that the IPv4 client ...

Feb 12 '24 18:02 sukunrt

I think you meant:

Ah, sorry, I meant.

IPv6 server is assigned a 64:f9b:: address so that the IPv4 client

Feb 12 '24 19:02 Stebalien

Basically, this is about assigning IPv4-compatible IPv6 addresses so IPv4-only clients can talk to IPv6 servers.

Feb 12 '24 19:02 Stebalien

NAT64 can only reference public addresses, yes. Any NAT64 address referencing a private address should be treated as invalid.

Yep from RFC 6052 section 3.1

The Well-Known Prefix MUST NOT be used to represent non-global IPv4 addresses, such as those defined in [RFC1918] or listed in Section 3 of [RFC5735].

Basically:

ipv4-mapped prefix (::ffff:0:0/96) -> IPv6 clients need to talk to a IPv4 server. Translation within the client's OS is expected because IPv4 is the preferred protocol).

This one (64:ff9b::) -> IPv4 and IPv6 clients need to talk to an IPv6 server. The IPv6 client is assigned a 64:f9b:: address so that the IPv4 client can "downgrade" the address to IPv4, but the IPv6-capable client should use the IPv6 address even if they support IPv4, to avoid translation.

At least that's my reading.

My reading is a bit different.

The IPv4-mapped prefix, ::ffff:0:0/96, wasn't chosen because existing nodes (Windows and Mac OS) would only send IPv4 packets, no IPv6 packets.
The Well Known prefix 64:ff9b::/96 is to allow IPv6 clients to talk to an IPv4 server. The IPv6 client sends a packet to a NAT64 server using the well known IPv6 prefix. The NAT64 server then translates (NAT) the packet to the destination IPv4 server. Here's a visual from RFC 6146:


             +---------------------+         +---------------+
             |IPv6 network         |         |    IPv4       |
             |           |  +-------------+  |  network      |
             |           |--| Name server |--|               |
             |           |  | with DNS64  |  |  +----+       |
             |  +----+   |  +-------------+  |  | H2 |       |
             |  | H1 |---|         |         |  +----+       |
             |  +----+   |      +-------+    |  192.0.2.1    |
             |2001:db8::1|------| NAT64 |----|               |
             |           |      +-------+    |               |
             |           |         |         |               |
             +---------------------+         +---------------+

https://datatracker.ietf.org/doc/html/rfc6146#section-1.2.2 contains a full walkthrough example that's helpful.

Mar 05 '24 23:03 MarcoPolo

The IPv4-mapped prefix, ::ffff:0:0/96, wasn't chosen because existing nodes (Windows and Mac OS) would only send IPv4 packets, no IPv6 packets.

There's an implicit "when IPv4 routes are available". My understanding is that this prefix exists to allow IPv6-only clients to reach IPv4-only nodes, so most operating systems prefer IPv4 to avoid this extra translation.

This RFC is trying to deal with the case where IPv6 is actually preferable.

Mar 06 '24 00:03 Stebalien

Nevermind..., you're right. IPv4-mapped addresses exist solely so that software can drop support for IPv4 addresses.

https://www.rfc-editor.org/rfc/rfc4038#section-4.2

Mar 06 '24 00:03 Stebalien

I was inferring something the RFC didn't state to try to figure out how IPv4-mapped addresses could possibly be useful on the internet.... and the answer is: they aren't.

Mar 06 '24 00:03 Stebalien

Closing this. Fixed in the latest go-multiaddr release. https://github.com/multiformats/go-multiaddr/releases/tag/v0.12.4

May 20 '24 15:05 sukunrt