Remove MRU limit on PPPoE
Fixes #331
I was intending to look at this at some point. Can we push this further? Why limit to 1540? Why limit at all? Shouldn't we get the interface MTU and calculate from there instead?
We have some interfaces with up to 9k MTU, and there is no reason why we would not want to be able to run PPP MTUs up to that size. Heck, if a 32000 MTU were possible, why would we not want to be able to do that?
I'm not objecting to this going in as-is as an interim measure, but if we're going to push this further we should go for gold if possible.
> I was intending to look at this at some point. Can we push this further? Why limit to 1540?
My idea was to limit it to 1540 so we can gather more reports with concrete use-cases for a higher MRU.
> Why limit at all? Shouldn't we get the interface MTU and calculate from there instead?
I am not really familiar with the codebase (new here), but having a concrete MRU limit allows for stack allocation, right? I thought we want to keep that property.
I think that would mostly be used by a userspace implementation where encap/decap is not done by the kernel. Not sure if that's even still used anywhere - specifically, note that the only struct that uses that value is PPPoEPacketStruct in pppoe.h (or just PPPoEPacket after the typedef).
This is used in common.c's parsePacket(), which is used for discovery - and discovery packets will NEVER be anywhere NEAR that big.
There are two possible mechanisms of use here.
- Static allocation.
- Dynamic allocation.
Static is the trickier case, since we'd need to convert those uses. They are mostly allocations on the stack, so we can use alloca, eg change:
```c
PPPoEPacket foo;
```
to:
```c
PPPoEPacket *foo = alloca(sizeof(PPPoEPacket));
```
Now everything is dynamic, and what we really want is for that sizeof to be based on the actual negotiated MRU. If we change the last element in PPPoEPacketStruct to `unsigned char payload[0];` and we know the actual payload size (or the maximum MRU), we can set the size passed to either alloca or malloc to sizeof(PPPoEPacket) + payload_size for the allocation. Note that the length field already contains that size.
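To make the idea concrete, here's a minimal sketch of that conversion (the struct layout is simplified rather than the exact pppoe.h definition, and `negotiated_mru` is a hypothetical parameter):

```c
#include <alloca.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for PPPoEPacketStruct: the fixed-size
 * payload[ETH_JUMBO_LEN] tail becomes a flexible array member,
 * so each allocation can be sized to the negotiated MRU. */
typedef struct PPPoEPacketStruct {
    uint8_t  hdr[18];    /* placeholder: Ethernet (14) + ver/type, code, session (4) */
    uint16_t length;     /* payload length (network byte order on the wire) */
    uint8_t  payload[];  /* was: unsigned char payload[ETH_JUMBO_LEN] */
} PPPoEPacket;

void build_packet(size_t negotiated_mru)
{
    /* Stack allocation sized by the negotiated MRU instead of a
     * compile-time maximum such as ETH_JUMBO_LEN. */
    PPPoEPacket *pkt = alloca(sizeof(PPPoEPacket) + negotiated_mru);
    memset(pkt, 0, sizeof(PPPoEPacket) + negotiated_mru);
    /* ... fill headers and up to negotiated_mru payload bytes ... */
}
```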
This will eliminate the single biggest use of ETH_JUMBO_LEN, but there may be other requirements in the process too.
I think simply changing this max doesn't make all the required changes, unfortunately. We'd need to work through all uses of PPPoEPacket and update all of them, as well as any other uses of the ETH_JUMBO_LEN define and its derivatives (MAX_PPPOE_PAYLOAD, which is primarily used in CHECK_ROOM). So this project opens a bit of a can of worms, which is why I don't think anyone has started down the path yet.
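For context, CHECK_ROOM is (roughly) a bounds-check macro along these lines - a paraphrase from memory, not the verbatim pppoe.h definition:

```c
/* Approximate shape of CHECK_ROOM: bail out if appending `len` more
 * bytes at `cursor` would overflow MAX_PPPOE_PAYLOAD, which is itself
 * derived from ETH_JUMBO_LEN - so raising the MRU ceiling has to
 * propagate through this derived constant too. */
#define CHECK_ROOM(cursor, start, len)                          \
    do {                                                        \
        if (((cursor) - (start)) + (len) > MAX_PPPOE_PAYLOAD) { \
            error("Would create too-long packet");              \
            return;                                             \
        }                                                       \
    } while (0)
```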
My current understanding is that pppd only does the discovery and then lets the kernel do the rest. Doesn't that mean we can keep 1500 B (or even lower it to 1492 B) for all the packets we send during discovery, but still allow negotiating a higher MRU, which the kernel would then start using? Discovery would still be limited to 1500, which seems fine.
I believe the Control Protocols remain in userspace, so the question really is: at what point does the PPPoE encap/decap get handed off to the kernel?
So can we, at a later point in time after starting off with max(1500, interface MTU - 8) (-8 IIRC), then during LCP make that bigger based on the configured MRU/MTU and detected values? Ie, allow the pppoe code to set the MTU/MRU to interface_mtu - 8 when it starts up, on condition that no explicit value has been configured by the system administrator, or to lower a configured value. Eg, if the admin configures 3000 but the interface MTU is only 2000, then we should lower the values to 1992, right?
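A hypothetical sketch of that clamping rule (names made up for illustration; the 8-byte figure is the 6-byte PPPoE header plus the 2-byte PPP protocol field):

```c
/* PPPoE costs 8 bytes per frame (6-byte PPPoE header + 2-byte PPP
 * protocol field), so the PPP MRU/MTU can never exceed iface_mtu - 8. */
#define PPPOE_OVERHEAD 8

static int clamp_ppp_mru(int configured_mru, int iface_mtu)
{
    int limit = iface_mtu - PPPOE_OVERHEAD;  /* e.g. 2000 - 8 = 1992 */

    if (configured_mru <= 0)     /* nothing configured: default to the limit */
        return limit;
    if (configured_mru > limit)  /* configured too big: lower it to fit */
        return limit;
    return configured_mru;
}
```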
> Eg, if the admin configures 3000 but the interface MTU is only 2000, then we should lower the values to 1992, right?
I believe that this is already done, here:
https://github.com/ppp-project/ppp/blob/13291f4c08ebe23afc5427c0e66f8ebb56305fbe/pppd/plugins/pppoe/plugin.c#L177-L180
> and then during LCP make that bigger based on the configured MRU/MTU and detected values?
Yeah, but LCP seems to be handled by pppd, not by the pppoe plugin, so maybe it is already handled properly?
I'll see if I can get a test environment set up; I need to test the pppoe-ia code for pppoe-server (rp-pppoe) as well. Might as well do both at the same time.
So... (I think) maybe only changing (removing) this:
https://github.com/ppp-project/ppp/blob/13291f4c08ebe23afc5427c0e66f8ebb56305fbe/pppd/plugins/pppoe/plugin.c#L456-L459
would solve the issue?
Tested the above patch (https://github.com/ppp-project/ppp/pull/573#issuecomment-3474034514) and it also seems to work:
```
[mateusz@arch ~ ]$ ping -M do -s 1512 X.X.X.X%pppoe-wan
PING X.X.X.X (83.1.5.152) 1512(1540) bytes of data.
1520 bytes from X.X.X.X: icmp_seq=1 ttl=255 time=4.55 ms
1520 bytes from X.X.X.X: icmp_seq=2 ttl=255 time=3.45 ms
1520 bytes from X.X.X.X: icmp_seq=3 ttl=255 time=3.88 ms
```
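(The arithmetic checks out: a 1512-byte ICMP payload plus the 8-byte ICMP echo header gives the 1520-byte replies, and adding the 20-byte IPv4 header yields the 1540 bytes on the wire shown in parentheses - exactly the negotiated Max-Payload.)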
```
PPPoE plugin from pppd 2.5.3-dev
Send PPPOE Discovery V1T1 PADI session 0x0 length 18
dst ff:ff:ff:ff:ff:ff src XX:XX:XX:XX:XX:XX
[service-name] [host-uniq c4 54 00 00] [PPP-max-payload 06 04]
Recv PPPOE Discovery V1T1 PADO session 0x0 length 55
dst YY:YY:YY:YY:YY:YY src XX:XX:XX:XX:XX:XX
[AC-name bng2_re0] [host-uniq c4 54 00 00] [PPP-max-payload 06 04] [service-name] [AC-cookie ba e8 3f 7b 35 a8 0d 7c 41 19 3a dc 20 2c 03 03]
Max-Payload: 1540
Send PPPOE Discovery V1T1 PADR session 0x0 length 38
dst XX:XX:XX:XX:XX:XX src YY:YY:YY:YY:YY:YY
[service-name] [host-uniq c4 54 00 00] [PPP-max-payload 06 04] [AC-cookie ba e8 3f 7b 35 a8 0d 7c 41 19 3a dc 20 2c 03 03]
Recv PPPOE Discovery V1T1 PADS session 0x810 length 55
dst YY:YY:YY:YY:YY:YY src XX:XX:XX:XX:XX:XX
[service-name] [host-uniq c4 54 00 00] [PPP-max-payload 06 04] [AC-name bng2_re0] [AC-cookie ba e8 3f 7b 35 a8 0d 7c 41 19 3a dc 20 2c 03 03]
PPP session is 2064
Connected to D4:99:6C:41:24:43 via interface enp0s31f6.35
using channel 48
Using interface pppoe-wan
Connect: pppoe-wan <--> enp0s31f6.35
```
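For the record, the [PPP-max-payload 06 04] tag carries the value 0x0604 = 1540: this is the RFC 4638 PPP-Max-Payload tag sent in the PADI and echoed back by the AC in the PADO/PADS, which is where the "Max-Payload: 1540" line comes from.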
@paulusmack: Have you seen this PR?
> So... (I think) maybe only changing (removing) this:
> https://github.com/ppp-project/ppp/blob/13291f4c08ebe23afc5427c0e66f8ebb56305fbe/pppd/plugins/pppoe/plugin.c#L456-L459
> would solve the issue?
I believe you're right, under the assumption that we're using kernel-mode helpers (I don't think userspace is supported any more at all - at least, the pppoe-server side ditched that code a while back). I do not believe that any discovery frames can possibly exceed even 500 bytes (from what I recall of the protocol; if some network uses multi-KB line tags along with pppoe-ia then technically it's possible, I guess). In practice, though, I highly doubt we'll ever exceed even 100-byte frames during discovery (the longest line tags I've seen in my environments are probably 20 characters long).
@mateusz834 would you mind updating the PR to remove those lines?
@paulusmack definitely not ready to merge yet; your insight would be appreciated.
@dfskoll if you don't mind jumping in here as well, please just confirm our reasoning?
Would be absolutely tremendous if we could get to at least 1500 bytes of inner MTU for most of our clients - inter-AS traffic is (and probably will be for a very long time still) 1500-byte MTU, which would just mean less crap with routers that even in this day and age don't do proper MSS adjustment for TCP SYN frames, and perhaps slightly less fragmentation for the odd 1500-byte UDP frame.
@jkroonza updated the PR
@paulusmack, @enaess, @chipitsine, @jow-, @systemcrash: Have you seen this PPP PR?
It is linked, for example, to:
- PPP file: https://github.com/ppp-project/ppp/blob/master/pppd/plugins/pppoe/discovery.c#L723
- OpenWrt file: https://github.com/openwrt/openwrt/blob/main/package/network/services/ppp/files/ppp.sh#L243
- https://github.com/ppp-project/ppp/issues/331
- https://github.com/ppp-project/ppp/pull/573
- https://github.com/openwrt/openwrt/issues/5265
- https://github.com/openwrt/openwrt/issues/8190
- https://github.com/openwrt/openwrt/issues/14024
- https://github.com/openwrt/openwrt/pull/20606
- https://github.com/opnsense/core/issues/9398
MTU on wiki.wireshark.org:
- https://wiki.wireshark.org/MTU
MTU-related RFCs:
RFC 1191: Path MTU discovery
- https://datatracker.ietf.org/doc/html/rfc1191
Common media MTUs (per the Wireshark wiki table):
- 1500: Ethernet Networks (RFC 894)
- 1500: Point-to-Point, default (RFC 1134)
- 1492: IEEE 802.3 (RFC 1042)
RFC 894: A Standard for the Transmission of IP Datagrams over Ethernet Networks
- https://datatracker.ietf.org/doc/html/rfc894
RFC 1042: Standard for the transmission of IP datagrams over IEEE 802 networks
- https://datatracker.ietf.org/doc/html/rfc1042
RFC 1134: Point-to-Point Protocol: A proposal for multi-protocol transmission of datagrams over Point-to-Point links
- https://datatracker.ietf.org/doc/html/rfc1134
RFC 1981: Path MTU Discovery for IP version 6
- https://datatracker.ietf.org/doc/html/rfc1981
RFC 2923: TCP Problems with Path MTU Discovery
- https://datatracker.ietf.org/doc/html/rfc2923
RFC 3988: Maximum Transmission Unit Signalling Extensions for the Label Distribution Protocol
- https://datatracker.ietf.org/doc/html/rfc3988
RFC 4459: MTU and Fragmentation Issues with In-the-Network Tunneling
- https://datatracker.ietf.org/doc/html/rfc4459
RFC 4638: Accommodating a Maximum Transit Unit/Maximum Receive Unit (MTU/MRU) Greater Than 1492 in the Point-to-Point Protocol over Ethernet (PPPoE)
- https://datatracker.ietf.org/doc/html/rfc4638
RFC 4821: Packetization Layer Path MTU Discovery
- https://datatracker.ietf.org/doc/html/rfc4821
RFC 7690: Close Encounters of the ICMP Type 2 Kind (Near Misses with ICMPv6 Packet Too Big (PTB))
- https://datatracker.ietf.org/doc/html/rfc7690
RFC 8201: Path MTU Discovery for IP version 6
- https://datatracker.ietf.org/doc/html/rfc8201
RFC 8899: Packetization Layer Path MTU Discovery for Datagram Transports
- https://datatracker.ietf.org/doc/html/rfc8899
RFC 9268: IPv6 Minimum Path MTU Hop-by-Hop Option
- https://datatracker.ietf.org/doc/html/rfc9268
The initial commit message does not explain much:
https://github.com/ppp-project/ppp/commit/43c9218605e0d7302cc614063de8e7078acabe46
That commit seems to be pre-RFC 4638, so I guess the behavior back then was right; the commit basically allowed MRU < 1492.
> That commit seems to be pre-RFC 4638, so I guess the behavior back then was right; the commit basically allowed MRU < 1492.
Thus we can now drop that behaviour.
(Or maybe add a config option to keep the pre-RFC 4638 behaviour, but I doubt it's worth it.)
I wonder if there's some implementation that counts on MRU < 1492.
> I wonder if there's some implementation that counts on MRU < 1492.
I doubt it, but even if there are such implementations, this is still configurable.
For the sake of simplicity, I think your way is the best one (even if it looks like a breaking change).
Just to give some feedback - some pre-bleak-friday preparation and changes have kept us busy on our end. We're discussing a test environment for this and other patches on pppoe-server too, which closely correlate.
I never included this kind of thing before because the original PPPoE RFC said that the MRU must be limited to 1492. Now that RFC 4638 gives us a way to lift that limit, we should implement that rather than just blindly removing the limit. In other words, I won't take this but I would take an implementation of RFC 4638.
Aren't we already supporting RFC 4638, but only up to 1500 MRU? This only removes that restriction and allows >1500 MRU.
If RFC 4638 negotiation does not happen during discovery, then we still limit the MRU to 1492, see: https://github.com/ppp-project/ppp/blob/1affa97637a405b1557b3e7f1c23239bfe65889d/pppd/plugins/pppoe/discovery.c#L722-L728
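In other words, the fallback behaves roughly like this (a hypothetical paraphrase with made-up names, not the actual discovery.c code):

```c
#define RFC2516_MAX_MRU 1492  /* classic PPPoE ceiling: 1500 - 8 bytes overhead */

/* Hypothetical illustration: the MRU may exceed 1492 only when the
 * RFC 4638 PPP-Max-Payload tag was successfully echoed during discovery. */
static int effective_mru(int requested_mru, int saw_max_payload_tag)
{
    if (!saw_max_payload_tag && requested_mru > RFC2516_MAX_MRU)
        return RFC2516_MAX_MRU;
    return requested_mru;
}
```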
See fd1dcdf758418f040da3ed801ab001b5e46854e7
@mateusz834: What is needed to have a perfect PR to support RFC 4638, etc.?
Per my understanding, we already have an RFC 4638-compatible implementation, but limited to 1500 MRU. I think we only need to remove that 1500 MRU limit (this PR).