
ZT works fine apart from HTTP *uploads*

Open dch opened this issue 2 years ago • 12 comments

Please let us know

  • What you expect to be happening.

HTTP POST/PUT uploads over ZT should Just Work, like they always have on ZeroTier.

  • What is actually happening?

HTTP PUT/POST hangs; AFAICT the data never makes it across the wire. Running tcpdump on the far (HTTP server) end shows no traffic.

Dropping MTU on zt interface to 1200 (!) makes it work.

  • Any steps to reproduce the error.

A bit difficult to document here specifically, but it's basically

curl http://6plane/ --data-binary

The issue is 100% reproducible, and the PUT request never finishes until the MTU is changed.

  • Any relevant console output or screenshots.

no console data whatsoever, but tcpdump on local end shows this prior to hacking MTU.

[image: tcpdump capture on the local end]

  • What operating system and ZeroTier version. Please try the latest ZeroTier release.

  • (client) FreeBSD 14.0-CURRENT + (server) FreeBSD 13.1-RELEASE both amd64

  • zerotier 1.10.2 both ends

I see a number of MTU-related patches recently, but these all appear to be Linux-specific.

dch avatar Jan 30 '23 21:01 dch

Curious what the MTU is on your physical network(s) on both ends.

glimberg avatar Jan 30 '23 21:01 glimberg

Hello @dch. Thanks for reporting in. We've seen others report similar results when setting the MTU lower.

The two recent MTU-related patches were for:

  • Overlay network MTU (e.g. on your tap device) not being set correctly on Linux
  • The physical MTU is defined in code as 1432, which causes some issues, so it's now possible to set it to an arbitrary value through an undocumented workaround using multipath. There's an example in the commit message.

I'll be watching this ticket as I'm going to write some MTU guidance at some point.

joseph-henry avatar Jan 30 '23 22:01 joseph-henry

Which HTTP server is it?

laduke avatar Jan 30 '23 22:01 laduke

Apache CouchDB. But I suspect this will happen with anything. I don't think the data ever leaves the local buffers, so I should be able to reproduce with netcat.

dch avatar Jan 30 '23 22:01 dch

@joseph-henry what do you recommend for doing (P)MTU discovery, given it's all UDP?

dch avatar Jan 30 '23 22:01 dch

wireshark over the lower (igb0) interface shows that curl with 1263 bytes is too much, and gets fragmented, but curl with 1262 bytes is ok:

[image: 20230130-231250]

In this scenario I do see a reply back from the server (erlang mochiweb, "any of you quaids got a smint?" reference) but the HTTP transaction fails to complete.

The successful 1262-byte request puts 1372 bytes on the wire.

physical mtu

PC (1400) <> router (netgraph) 1492 <> router pppoe <> 1500 out to ISP router ⛈️ <> server 1500
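
A minimal sketch of the arithmetic behind these numbers, in Python. It only restates the figures quoted in this comment (1262, 1372, and the hop MTUs); the hop labels are an approximate reading of the path description, and the helper names are made up for illustration. It does not say which headers make up the overhead.

```python
# Figures quoted in the comment above; this script measures nothing itself.
PAYLOAD_OK = 1262    # largest curl body that was not fragmented
WIRE_SIZE_OK = 1372  # bytes reported on igb0 for that request

# Hop MTUs as read from the path description (labels are approximate).
HOP_MTUS = {
    "PC": 1400,
    "router (netgraph)": 1492,
    "ISP side": 1500,
    "server": 1500,
}


def overhead(payload: int, wire_size: int) -> int:
    """Bytes added between the curl body and what shows up on the wire."""
    return wire_size - payload


def path_mtu(hop_mtus: dict) -> int:
    """The smallest MTU along the path limits the whole path."""
    return min(hop_mtus.values())


def headroom(wire_size: int, hop_mtus: dict) -> int:
    """Bytes left between the observed wire size and the smallest hop MTU."""
    return path_mtu(hop_mtus) - wire_size


print("overhead:", overhead(PAYLOAD_OK, WIRE_SIZE_OK))
print("path MTU:", path_mtu(HOP_MTUS))
print("headroom:", headroom(WIRE_SIZE_OK, HOP_MTUS))
```

The remaining headroom under the 1400-byte PC interface is 28 bytes, which happens to equal an IPv4 header without options (20) plus a UDP header (8). That may or may not be a coincidence; it depends on what the 1372 figure actually counts.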

dch avatar Jan 30 '23 23:01 dch

adjusting ZT interface mtu = 1335 seems to be the magic number

dch avatar Jan 31 '23 10:01 dch

Thanks. Does the connection between PC and server happen to be relaying? I can reproduce the symptoms when I force it to UDP relay: HTTP/TCP stops working, and lowering the MTU on the virtual interface mitigates it.

Will try again with a direct connection...

I've been using something like this in local.conf, instead of using the firewall to force relaying: {"virtual": {"xxxx41d851": {"blacklist": ["xxx.xxx.19.156/32"]}}}. Either way works.

laduke avatar Jan 31 '23 16:01 laduke

Because we suffer from double NAT between these systems & the internet (until I finish IPv6 everywhere), I have this config. Ports are hardwired, and it has been very reliable since this was put in place.

{
  "physical": {
    "100.64.0.0/16": {
      "blacklist": true
    },
    "10.0.0.0/8": {
      "blacklist": true
    },
    "127.0.0.0/8": {
      "blacklist": true
    }
  },
  "settings": {
    "primaryPort": 9994,
    "allowSecondaryPort": false,
    "portMappingEnabled": false,
    "allowTcpFallbackRelay": false,
    "defaultBondingPolicy": "custom-active-backup",
    "policies": {
      "custom-active-backup": {
        "basePolicy": "active-backup",
        "failoverInterval": 10000
      }
    }
  }
}

The bonding seems to help avoid the NATs deciding we're offline.

I tend to notice relaying: the performance is immediately unusable. Now, if I could run my own relays somehow?
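
CIDR keys in a local.conf "physical" section are easy to mistype (a missing octet, for example). The sketch below is one way to sanity-check them with Python's standard ipaddress module. The function name and the sample config are made up for illustration, and the sample contains a deliberately malformed key. How ZeroTier itself treats a malformed key is not something this checks.

```python
import ipaddress
import json


def invalid_physical_keys(conf_text: str) -> list:
    """Return keys under "physical" that do not parse as CIDR networks."""
    conf = json.loads(conf_text)
    bad = []
    for key in conf.get("physical", {}):
        try:
            # strict=True also rejects networks with host bits set.
            ipaddress.ip_network(key, strict=True)
        except ValueError:
            bad.append(key)
    return bad


# Hypothetical sample; "10.0.0/8" is missing an octet on purpose.
SAMPLE = """
{
  "physical": {
    "100.64.0.0/16": {"blacklist": true},
    "10.0.0/8": {"blacklist": true},
    "127.0.0.0/8": {"blacklist": true}
  }
}
"""

print(invalid_physical_keys(SAMPLE))
```

To check a real file, pass its contents instead of SAMPLE, for example `invalid_physical_keys(open(path).read())` with `path` pointing at your own local.conf.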

dch avatar Jan 31 '23 21:01 dch

after some experimentation, lowest working mtu on local PC is 1335, which hits this 1372 MTU on the igb0 physical interface again.

On my router (before the ISP one) I can ping non-ZT to the server with max size 1464.
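
For anyone wanting to repeat this kind of "largest ping that gets through" measurement, here is a hedged sketch in Python. It binary-searches for the largest size a probe accepts. The probe is pluggable; the ping-based one assumes FreeBSD's `ping -D` (set Don't Fragment) and can be switched to Linux's `-M do`. The function names are hypothetical, and the search assumes that if a size passes, every smaller size passes too. This probes the physical path only and says nothing about how ZeroTier does its own discovery.

```python
import subprocess


def largest_passing_size(probe, low: int, high: int):
    """Largest size in [low, high] for which probe(size) is True.

    Assumes probe is monotonic (smaller sizes never fail when a larger
    one passes). Returns None if even `low` fails.
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always shrinks
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low


def ping_probe(host: str, size: int, df_flags=("-D",)) -> bool:
    """Send one ping with Don't Fragment set; True if a reply came back.

    df_flags defaults to FreeBSD's "-D"; pass ("-M", "do") on Linux.
    A single lost packet counts as a failure, so results can be noisy.
    """
    cmd = ["ping", "-c", "1", *df_flags, "-s", str(size), host]
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=5,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0
```

A call would look like `largest_passing_size(lambda s: ping_probe("server.example", s), 1000, 1472)`, where `server.example` is a placeholder host. The size given to ping is the ICMP payload, so the IPv4 packet is 28 bytes larger (20-byte IP header plus 8-byte ICMP header); a maximum of 1464 therefore corresponds to a 1492-byte packet.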

dch avatar Feb 01 '23 15:02 dch

Hey there. Not much interesting to report. I tried various MTU combinations and the same local.conf, but both of my devices are Linux.

rPi (1400) <> router (lan) {1400,1492,1500} - router (wan) {1400,1492,1500} <> ⛈️ <> server 1500 (no NAT)

(zerotier interfaces at 2800)

unfortunately nothing weird happened yet. tried bigger pings and netcat and curl-ing files to netcat.

laduke avatar Feb 08 '23 18:02 laduke

given zt is all udp (unless relaying) how does it do mtu discovery? is there a certain type of icmp packet that we might be blocking, for example?

dch avatar Mar 14 '24 07:03 dch