Subnet router ACL's broken on 0.23.0-alpha1

Open Sh4d opened this issue 1 year ago • 22 comments

Bug description

If you define an ACL on 0.23.0-alpha1, it breaks subnet routing. It works if you allow access to 0.0.0.0/0, but anything more specific just breaks traffic.

Environment

  • Ubuntu server with headscale 0.23.0-alpha1 on public IP space
  • Ubuntu server with exit / subnet router on 1.52.1 on internal IP space
  • Windows test machine on 1.52.1

To Reproduce

  1. Bring the Linux subnet router online with:
 tailscale up --advertise-routes=10.33.0.0/16 --login-server=https://XXXXX --hostname=Test-Exit --reset
  2. Enable the route on the headscale server and make sure IP forwarding is enabled on Test-Exit
  3. Configure this ACL:
{
  "groups": {
    "group:admins": ["user1"],
  },
  "acls": [
    { "action": "accept", "src": ["group:admins"], "dst": ["group:admins:*"] },
    { "action": "accept", "src": ["group:admins"], "dst": ["10.33.0.0/16:*"] },
    { "action": "accept", "src": ["group:admins"], "dst": ["0.0.0.0/0:*"] },
  ]
}
  4. Ping 10.33.10.10 from the Windows test machine and confirm it works
  5. Comment out the 0.0.0.0/0 rule, and kill -HUP headscaled
  6. See that the ping stops working
  7. Uncomment the 0.0.0.0/0 rule again, kill -HUP, and the ping starts working

This exact same config works fine on 0.22.3.

Sh4d avatar Nov 15 '23 18:11 Sh4d

0.23.0-alpha2 addresses a series of issues with node synchronisation, online status and subnet routers. Please test this release and report back if the issue still persists.

kradalby avatar Dec 10 '23 15:12 kradalby

Hello, it seems 0.23.0-alpha2 still has this issue. I've just updated and then reverted my deployment back to 0.22.3 because I was unable to use subnet routers. On 0.22.3 everything works as expected.

winterheart avatar Dec 13 '23 19:12 winterheart

I think this may be related to the ACLs - I have an alpha2 where subnet routes are working properly (albeit I have not tested with nodes that are exit nodes).

jwischka avatar Dec 19 '23 12:12 jwischka

@jwischka Can you share your ACL? I have the same problem; only an ACL with "*:0" helps, but it's not what I want.

kfkawalec avatar Dec 19 '23 22:12 kfkawalec

Mine are basically a series of:

{
  "acls": [
    {
      "action": "accept",
      "src": ["user1"],
      "dst": ["*:*"],
    },
    {
      "action": "accept",
      "src": ["user2"],
      "dst": ["user2:*"],
    },
    {
      "action": "accept",
      "src": ["user3"],
      "dst": ["user2:*"],
    },
  ],
  "disableIPv4": false,
  "randomizeClientPort": true,
}

jwischka avatar Dec 20 '23 01:12 jwischka

In my case:

  • if user1 is the router, then user2 and user3 can connect through the router
  • if user2 is the router, user1 can connect (:) but user3 cannot

kfkawalec avatar Dec 20 '23 06:12 kfkawalec

@Sh4d @kfkawalec

Could you please get me a copy of the netmap from the nodes from both version 0.23.0-alpha2 and 0.22.3 so I can compare them?

You can do that with tailscale debug netmap > netmap.json on each node. I reckon the most interesting node is the one that can no longer ping.
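
To compare the two dumps, something like the following could help. It is only a sketch and assumes each dump is plain JSON; the function names normalize and diff_netmaps and the sample strings are made up for illustration and are not part of headscale or tailscale:

```python
import difflib
import json

def normalize(netmap_text):
    """Re-serialize JSON with sorted keys so key order does not show up as a difference."""
    return json.dumps(json.loads(netmap_text), indent=2, sort_keys=True).splitlines()

def diff_netmaps(old_text, new_text, old_label="0.22.3", new_label="0.23.0-alpha2"):
    """Return unified-diff lines between two netmap dumps given as JSON text."""
    return list(difflib.unified_diff(
        normalize(old_text), normalize(new_text),
        fromfile=old_label, tofile=new_label, lineterm="",
    ))

# Illustrative stand-ins for the contents of the two netmap.json files.
old = '{"PacketFilter": [{"Dsts": [{"Net": "10.0.0.0/16"}, {"Net": "100.64.0.2/32"}]}]}'
new = '{"PacketFilter": [{"Dsts": [{"Net": "10.0.0.0/16"}]}]}'

for line in diff_netmaps(old, new):
    print(line)
```

In practice the two strings would be read from the files saved on each version, for example with open("netmap.json").read().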

kradalby avatar Jan 03 '24 08:01 kradalby

I think I have fixed this in https://github.com/juanfont/headscale/pull/1673, could you try that PR?

kradalby avatar Jan 03 '24 15:01 kradalby

At the moment I do not have a 0.23.0-alpha2 installation.

kfkawalec avatar Jan 05 '24 21:01 kfkawalec

Not sure if it's entirely related, but I am seeing broken subnet routes when setting an ACL.

I have the following ACL:

"acls": [
            {
                "action": "accept",
                "src": [
                        "group:developers"
                ],
                "dst": [
                        "test-server"
                ]
            },
            {
                "action": "accept",
                "src": [
                        "group:developers"
                ],
                "dst": [
                        "group:developers:*"
                ]
            },
            {
                "action": "accept",
                "src": [
                    "*",
                    "group:internal-exitnode"
                ],
                "dst": [
                    "group:internal-exitnode:*"
                ]
            }
    ]

The test-server is defined as a host and is on the same subnet as the "exit node" that advertises the route for that subnet; the route is enabled.

With this ACL config, I do not see the subnet route under DstPorts in the PacketFilterRules (I still see the other two rules, though).

When I set a wildcard dst for the developers group, I can ping everything on the subnet and can see the wildcard in DstPorts in the PacketFilterRules.
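
A rough way to run that check against a netmap dump is sketched below. It assumes the dump has a top-level PacketFilterRules list whose DstPorts entries carry an IP field, and that a wildcard destination shows up as "*"; the function name filter_covers and all addresses in the sample data are made up:

```python
import ipaddress

def filter_covers(netmap, target):
    """Return True if any PacketFilterRules DstPorts entry covers the target address."""
    addr = ipaddress.ip_address(target)
    for rule in netmap.get("PacketFilterRules", []):
        for dst in rule.get("DstPorts", []):
            ip = dst.get("IP", "")
            if ip == "*":
                return True  # wildcard destination (assumed representation)
            try:
                net = ipaddress.ip_network(ip, strict=False)
            except ValueError:
                continue  # skip anything that is not an address or CIDR
            if addr.version == net.version and addr in net:
                return True
    return False

# Made-up example data: one filter without the subnet, one with a wildcard.
without_route = {"PacketFilterRules": [
    {"SrcIPs": ["100.64.0.1/32"], "DstPorts": [{"IP": "100.64.0.3/32"}]},
]}
with_wildcard = {"PacketFilterRules": [
    {"SrcIPs": ["100.64.0.1/32"], "DstPorts": [{"IP": "*"}]},
]}

print(filter_covers(without_route, "192.168.1.20"))
print(filter_covers(with_wildcard, "192.168.1.20"))
```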

I have also tried @kradalby's #1673 with the same result.

Tried on macOS, Windows and Linux clients with the exact same result. A packet capture on the client's tailscale interface shows the ICMP requests with no response found, and a Wireshark sshdump on tailscale0 of the subnet route server shows no ICMP packets at all.

sniff122 avatar Jan 08 '24 11:01 sniff122

@sniff122 does it work with 0.22.3? I just want to understand if it is already broken or a regression.

kradalby avatar Jan 08 '24 11:01 kradalby

@kradalby I believe I was having the issue that was fixed by #1564, which is in 0.23.0-alpha2. I shall try 0.22.3 again.

sniff122 avatar Jan 08 '24 11:01 sniff122

It doesn't look like I have a recent snapshot of my VM with 0.22.3; however, with 0.23.0-alpha1 the issues are still present.

sniff122 avatar Jan 08 '24 12:01 sniff122

OK, it would be ideal to get that tested, as it will indicate whether it is a new issue or an already existing one.

If it is a new issue (since 0.22.3), it will block the new release, but if it is an existing issue, then we should create a separate issue and solve it after 0.23.0 goes out.

kradalby avatar Jan 09 '24 06:01 kradalby

I just tested and confirmed this issue still exists on headscale_0.23.0-alpha2_linux_amd64.deb

Sh4d avatar Jan 09 '24 17:01 Sh4d

Here's a (slightly scrubbed) netmap from the client while broken. I was using the exact ACL at the top of this issue with just the allow 0.0.0.0 removed.

netmap.txt

Sh4d avatar Jan 09 '24 17:01 Sh4d

OK, it would be ideal to get that tested, as it will indicate whether it is a new issue or an already existing one.

If it is a new issue (since 0.22.3), it will block the new release, but if it is an existing issue, then we should create a separate issue and solve it after 0.23.0 goes out.

@kradalby

Just tested with 0.22.3 (found an older backup), and the ACLs appear to be working fine.

sniff122 avatar Jan 15 '24 13:01 sniff122

Could you give 0.23.0-alpha3 a go and report back?

kradalby avatar Jan 18 '24 16:01 kradalby

My install is prod now unfortunately so I'll need to spin up a test environment. Hopefully can do that next week.

Sh4d avatar Jan 19 '24 16:01 Sh4d

I am having the same problem with 0.23.0-alpha3. Let me know if I can help debugging it in some way.

oneingan avatar Jan 23 '24 23:01 oneingan

I have looked into this. It seems the user connecting to the subnet needs 'access' to the device/user the subnet is on. Cannot ping when the ACL is:

acls:
  - action: accept
    src: 
      - "group:admin"
    dst:
      - "10.0.0.0/16:*"

Can ping when the ACL is:

acls:
 - action: accept
   src: 
     - "group:admin"
   dst:
     - "10.0.0.0/16:*"
     - "node:0"

TL;DR: peers are not inferred from advertised routes. Is this intended, @kradalby?
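
The difference can be pictured with a toy model. This is not headscale's code; the Node class, both function names, and the assumption that 100.64.0.2 is the subnet router's own address are made up for illustration:

```python
import ipaddress
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    ips: list                                   # the node's own tailnet addresses
    routes: list = field(default_factory=list)  # advertised subnet routes

def _nets(cidrs):
    return [ipaddress.ip_network(c) for c in cidrs]

def visible_by_own_ip(node, acl_dsts):
    """Behaviour described above: an ACL dst must cover one of the node's own IPs."""
    return any(
        addr.version == net.version and addr in net
        for addr in map(ipaddress.ip_address, node.ips)
        for net in _nets(acl_dsts)
    )

def visible_by_route(node, acl_dsts):
    """Hypothetical alternative: an ACL dst overlapping an advertised route is enough."""
    return any(
        route.version == net.version and route.overlaps(net)
        for route in _nets(node.routes)
        for net in _nets(acl_dsts)
    )

router = Node("subnet-router", ips=["100.64.0.2"], routes=["10.0.0.0/16"])

print(visible_by_own_ip(router, ["10.0.0.0/16"]))                   # False
print(visible_by_own_ip(router, ["10.0.0.0/16", "100.64.0.2/32"]))  # True
print(visible_by_route(router, ["10.0.0.0/16"]))                    # True
```

Under the first check the router only becomes a peer once its own address is listed as a destination, which matches the two ACLs above; the second check is what inferring peers from advertised routes might look like.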

TotoTheDragon avatar Feb 08 '24 16:02 TotoTheDragon

Related netmap when not working

{
  "PacketFilter": [
    {
      "IPProto": [
        6,
        17,
        1,
        58
      ],
      "Srcs": [
        "100.64.0.1/32",
        "fd7a:115c:a1e0::1/128"
      ],
      "Dsts": [
        {
          "Net": "10.0.0.0/16",
          "Ports": {
            "First": 0,
            "Last": 65535
          }
        }
      ],
      "Caps": []
    }
  ],
  "PacketFilterRules": [
    {
      "SrcIPs": [
        "100.64.0.1/32",
        "fd7a:115c:a1e0::1/128"
      ],
      "DstPorts": [
        {
          "IP": "10.0.0.0/16",
          "Bits": null,
          "Ports": {
            "First": 0,
            "Last": 65535
          }
        }
      ]
    }
  ]
}

Related netmap when working

{
  "PacketFilter": [
    {
      "IPProto": [
        6,
        17,
        1,
        58
      ],
      "Srcs": [
        "100.64.0.1/32",
        "fd7a:115c:a1e0::1/128"
      ],
      "Dsts": [
        {
          "Net": "10.0.0.0/16",
          "Ports": {
            "First": 0,
            "Last": 65535
          }
        },
        {
          "Net": "100.64.0.2/32",
          "Ports": {
            "First": 0,
            "Last": 0
          }
        },
        {
          "Net": "fd7a:115c:a1e0::2/128",
          "Ports": {
            "First": 0,
            "Last": 0
          }
        }
      ],
      "Caps": []
    }
  ],
  "PacketFilterRules": [
    {
      "SrcIPs": [
        "100.64.0.1/32",
        "fd7a:115c:a1e0::1/128"
      ],
      "DstPorts": [
        {
          "IP": "10.0.0.0/16",
          "Bits": null,
          "Ports": {
            "First": 0,
            "Last": 65535
          }
        },
        {
          "IP": "100.64.0.2/32",
          "Bits": null,
          "Ports": {
            "First": 0,
            "Last": 0
          }
        },
        {
          "IP": "fd7a:115c:a1e0::2/128",
          "Bits": null,
          "Ports": {
            "First": 0,
            "Last": 0
          }
        }
      ]
    }
  ]
}
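
For a quick comparison of the two dumps, the destination sets can be subtracted. A small sketch, with only the Dsts entries copied from the netmaps above and a made-up helper name destinations:

```python
def destinations(netmap):
    """Collect (network, first_port, last_port) tuples from every PacketFilter entry."""
    return {
        (dst["Net"], dst["Ports"]["First"], dst["Ports"]["Last"])
        for rule in netmap.get("PacketFilter", [])
        for dst in rule.get("Dsts", [])
    }

# Only the Dsts entries from the two dumps; other fields are omitted for brevity.
not_working = {"PacketFilter": [{"Dsts": [
    {"Net": "10.0.0.0/16", "Ports": {"First": 0, "Last": 65535}},
]}]}
working = {"PacketFilter": [{"Dsts": [
    {"Net": "10.0.0.0/16", "Ports": {"First": 0, "Last": 65535}},
    {"Net": "100.64.0.2/32", "Ports": {"First": 0, "Last": 0}},
    {"Net": "fd7a:115c:a1e0::2/128", "Ports": {"First": 0, "Last": 0}},
]}]}

only_in_working = sorted(destinations(working) - destinations(not_working))
print(only_in_working)
# [('100.64.0.2/32', 0, 0), ('fd7a:115c:a1e0::2/128', 0, 0)]
```

The only difference is the pair of /32 and /128 entries with port range 0 to 0, presumably the subnet router's own addresses added by the "node:0" destination.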

TotoTheDragon avatar Feb 08 '24 16:02 TotoTheDragon