
Run a lighthouse on k8s

Open · rolandjitsu opened this issue 3 years ago · 4 comments

I wonder if anyone has ever managed to get a nebula lighthouse to work in a k8s pod and then expose it to the world via a Service.

This is my setup:

  1. Dockerfile
# syntax = docker/dockerfile:1.1-experimental
# NOTE: `sudo` doesn't work here!
FROM --platform=${BUILDPLATFORM} debian:buster as builder

ARG TARGETARCH
ARG TARGETOS
ARG NEBULA_VERSION=v1.4.0

WORKDIR /tmp

RUN apt-get update && \
  apt-get -y install wget && \
  NEBULA_ARCHIVE=nebula-$TARGETOS-$TARGETARCH.tar.gz && \
  wget https://github.com/slackhq/nebula/releases/download/$NEBULA_VERSION/$NEBULA_ARCHIVE && \
  tar -xvf $NEBULA_ARCHIVE && \
  rm -rf $NEBULA_ARCHIVE \
    ./nebula-cert \
    /var/lib/apt/lists/*

FROM debian:buster

WORKDIR /app

COPY --from=builder /tmp/nebula .
COPY configs/nebula/config.yaml .

CMD ["/app/nebula", "-config", "/app/config.yaml"]
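For reference, `BUILDPLATFORM`, `TARGETOS` and `TARGETARCH` are only populated when the image is built with BuildKit/buildx; a build might look roughly like this (the image name is just the placeholder reused in the Deployment below, and `--push` assumes a registry is configured):

```sh
# Build and push a linux/amd64 image; buildx fills in TARGETOS/TARGETARCH.
docker buildx build --platform linux/amd64 -t myorg/nebula --push .
```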
  2. config.yaml
# This is the nebula example configuration file. You must edit, at a minimum, the static_host_map, lighthouse, and firewall sections
# Some options in this file are HUPable, including the pki section. (A HUP will reload credentials from disk without affecting existing tunnels)

# PKI defines the location of credentials for this node. Each of these can also be inlined by using the yaml ": |" syntax.
pki:
  # The CAs that are accepted by this node. Must contain one or more certificates created by 'nebula-cert ca'
  ca: /app/certs/ca.crt
  cert: /app/certs/lighthouse.crt
  key: /app/certs/lighthouse.key
  #blocklist is a list of certificate fingerprints that we will refuse to talk to
  #blocklist:
  #  - c99d4e650533b92061b09918e838a5a0a6aaee21eed1d12fd937682865936c72

# The static host map defines a set of hosts with fixed IP addresses on the internet (or any network).
# A host can have multiple fixed IP addresses defined here, and nebula will try each when establishing a tunnel.
# The syntax is:
#   "{nebula ip}": ["{routable ip/dns name}:{routable port}"]
# Example, if your lighthouse has the nebula IP of 192.168.100.1 and has the real ip address of 100.64.22.11 and runs on port 4242:
# static_host_map:
#   "192.168.100.1": ["x.x.x.x:4242"]


lighthouse:
  # am_lighthouse is used to enable lighthouse functionality for a node. This should ONLY be true on nodes
  # you have configured to be lighthouses in your network
  am_lighthouse: true
  # serve_dns optionally starts a dns listener that responds to various queries and can even be
  # delegated to for resolution
  #serve_dns: false
  #dns:
    # The DNS host defines the IP to bind the dns listener to. This also allows binding to the nebula node IP.
    #host: 0.0.0.0
    #port: 53
  # interval is the number of seconds between updates from this node to a lighthouse.
  # during updates, a node sends information about its current IP addresses to each node.
  interval: 60
  # hosts is a list of lighthouse hosts this node should report to and query from
  # IMPORTANT: THIS SHOULD BE EMPTY ON LIGHTHOUSE NODES
  # IMPORTANT2: THIS SHOULD BE LIGHTHOUSES' NEBULA IPs, NOT LIGHTHOUSES' REAL ROUTABLE IPs
  # hosts:
  #   - "192.168.100.1"

  # remote_allow_list allows you to control ip ranges that this node will
  # consider when handshaking to another node. By default, any remote IPs are
  # allowed. You can provide CIDRs here with `true` to allow and `false` to
  # deny. The most specific CIDR rule applies to each remote. If all rules are
  # "allow", the default will be "deny", and vice-versa. If both "allow" and
  # "deny" rules are present, then you MUST set a rule for "0.0.0.0/0" as the
  # default.
  #remote_allow_list:
    # Example to block IPs from this subnet from being used for remote IPs.
    #"172.16.0.0/12": false

    # A more complicated example, allow public IPs but only private IPs from a specific subnet
    #"0.0.0.0/0": true
    #"10.0.0.0/8": false
    #"10.42.42.0/24": true

  # local_allow_list allows you to filter which local IP addresses we advertise
  # to the lighthouses. This uses the same logic as `remote_allow_list`, but
  # additionally, you can specify an `interfaces` map of regular expressions
  # to match against interface names. The regexp must match the entire name.
  # All interface rules must be either true or false (and the default will be
  # the inverse). CIDR rules are matched after interface name rules.
  # Default is all local IP addresses.
  #local_allow_list:
    # Example to block tun0 and all docker interfaces.
    #interfaces:
      #tun0: false
      #'docker.*': false
    # Example to only advertise this subnet to the lighthouse.
    #"10.0.0.0/8": true

# Port Nebula will be listening on. The default here is 4242. For a lighthouse node, the port should be defined,
# however using port 0 will dynamically assign a port and is recommended for roaming nodes.
listen:
  # To listen on any ipv4 and ipv6 address use "[::]"
  host: 0.0.0.0
  port: 42420
  # Sets the max number of packets to pull from the kernel for each syscall (under systems that support recvmmsg)
  # default is 64, does not support reload
  #batch: 64
  # Configure socket buffers for the udp side (outside), leave unset to use the system defaults. Values will be doubled by the kernel
  # Default is net.core.rmem_default and net.core.wmem_default (/proc/sys/net/core/rmem_default and /proc/sys/net/core/wmem_default)
  # Maximum is limited by memory in the system, SO_RCVBUFFORCE and SO_SNDBUFFORCE is used to avoid having to raise the system wide
  # max, net.core.rmem_max and net.core.wmem_max
  #read_buffer: 10485760
  #write_buffer: 10485760

# EXPERIMENTAL: This option is currently only supported on linux and may
# change in future minor releases.
#
# Routines is the number of thread pairs to run that consume from the tun and UDP queues.
# Currently, this defaults to 1 which means we have 1 tun queue reader and 1
# UDP queue reader. Setting this above one will set IFF_MULTI_QUEUE on the tun
# device and SO_REUSEPORT on the UDP socket to allow multiple queues.
#routines: 1

punchy:
  # Continues to punch inbound/outbound at a regular interval to avoid expiration of firewall nat mappings
  punch: true

  # respond means that a node you are trying to reach will connect back out to you if your hole punching fails
  # this is extremely useful if one node is behind a difficult nat, such as a symmetric NAT
  # Default is false
  #respond: true

  # delays a punch response for misbehaving NATs, default is 1 second, respond must be true to take effect
  #delay: 1s

# Cipher allows you to choose between the available ciphers for your network. Options are chachapoly or aes
# IMPORTANT: this value must be identical on ALL NODES/LIGHTHOUSES. We do not/will not support use of different ciphers simultaneously!
#cipher: chachapoly

# Local range is used to define a hint about the local network range, which speeds up discovering the fastest
# path to a network adjacent nebula node.
#local_range: "172.16.0.0/24"

# sshd can expose informational and administrative functions via ssh.
#sshd:
  # Toggles the feature
  #enabled: true
  # Host and port to listen on, port 22 is not allowed for your safety
  #listen: 127.0.0.1:2222
  # A file containing the ssh host private key to use
  # A decent way to generate one: ssh-keygen -t ed25519 -f ssh_host_ed25519_key -N "" < /dev/null
  #host_key: ./ssh_host_ed25519_key
  # A file containing a list of authorized public keys
  #authorized_users:
    #- user: steeeeve
      # keys can be an array of strings or single string
      #keys:
        #- "ssh public key string"

# Configure the private interface. Note: addr is baked into the nebula certificate
tun:
  # When tun is disabled, a lighthouse can be started without a local tun interface (and therefore without root)
  disabled: false
  # Name of the device
  dev: nebula1
  # Toggles forwarding of local broadcast packets, the address of which depends on the ip/mask encoded in pki.cert
  drop_local_broadcast: false
  # Toggles forwarding of multicast packets
  drop_multicast: false
  # Sets the transmit queue length, if you notice lots of transmit drops on the tun it may help to raise this number. Default is 500
  tx_queue: 500
  # Default MTU for every packet, safe setting is (and the default) 1300 for internet based traffic
  mtu: 1300
  # Route based MTU overrides, if you have known vpn ip paths that can support larger MTUs you can increase/decrease them here
  routes:
    #- mtu: 8800
    #  route: 10.0.0.0/16
  # Unsafe routes allows you to route traffic over nebula to non-nebula nodes
  # Unsafe routes should be avoided unless you have hosts/services that cannot run nebula
  # NOTE: The nebula certificate of the "via" node *MUST* have the "route" defined as a subnet in its certificate
  unsafe_routes:
    #- route: 172.16.1.0/24
    #  via: 192.168.100.99
    #  mtu: 1300 #mtu will default to tun mtu if this option is not specified


# TODO
# Configure logging level
logging:
  # panic, fatal, error, warning, info, or debug. Default is info
  level: info
  # json or text formats currently available. Default is text
  format: json
  # Disable timestamp logging. useful when output is redirected to logging system that already adds timestamps. Default is false
  #disable_timestamp: true
  # timestamp format is specified in Go time format, see:
  #     https://golang.org/pkg/time/#pkg-constants
  # default when `format: json`: "2006-01-02T15:04:05Z07:00" (RFC3339)
  # default when `format: text`:
  #     when TTY attached: seconds since beginning of execution
  #     otherwise: "2006-01-02T15:04:05Z07:00" (RFC3339)
  # As an example, to log as RFC3339 with millisecond precision, set to:
  #timestamp_format: "2006-01-02T15:04:05.000Z07:00"

#stats:
  #type: graphite
  #prefix: nebula
  #protocol: tcp
  #host: 127.0.0.1:9999
  #interval: 10s

  #type: prometheus
  #listen: 127.0.0.1:8080
  #path: /metrics
  #namespace: prometheusns
  #subsystem: nebula
  #interval: 10s

  # enables counter metrics for meta packets
  #   e.g.: `messages.tx.handshake`
  # NOTE: `message.{tx,rx}.recv_error` is always emitted
  #message_metrics: false

  # enables detailed counter metrics for lighthouse packets
  #   e.g.: `lighthouse.rx.HostQuery`
  #lighthouse_metrics: false

# Handshake Manager Settings
#handshakes:
  # Handshakes are sent to all known addresses at each interval with a linear backoff,
  # Wait try_interval after the 1st attempt, 2 * try_interval after the 2nd, etc, until the handshake is older than timeout
  # A 100ms interval with the default 10 retries will give a handshake 5.5 seconds to resolve before timing out
  #try_interval: 100ms
  #retries: 20
  # trigger_buffer is the size of the buffer channel for quickly sending handshakes
  # after receiving the response for lighthouse queries
  #trigger_buffer: 64


# Nebula security group configuration
firewall:
  conntrack:
    tcp_timeout: 12m
    udp_timeout: 3m
    default_timeout: 10m
    max_connections: 100000

  # The firewall is default deny. There is no way to write a deny rule.
  # Rules are comprised of a protocol, port, and one or more of host, group, or CIDR
  # Logical evaluation is roughly: port AND proto AND (ca_sha OR ca_name) AND (host OR group OR groups OR cidr)
  # - port: Takes `0` or `any` as any, a single number `80`, a range `200-901`, or `fragment` to match second and further fragments of fragmented packets (since there is no port available).
  #   code: same as port but makes more sense when talking about ICMP, TODO: this is not currently implemented in a way that works, use `any`
  #   proto: `any`, `tcp`, `udp`, or `icmp`
  #   host: `any` or a literal hostname, ie `test-host`
  #   group: `any` or a literal group name, ie `default-group`
  #   groups: Same as group but accepts a list of values. Multiple values are AND'd together and a certificate would have to contain all groups to pass
  #   cidr: a CIDR, `0.0.0.0/0` is any.
  #   ca_name: An issuing CA name
  #   ca_sha: An issuing CA shasum

  outbound:
    # Allow all outbound traffic from this node
    - port: any
      proto: any
      host: any

  inbound:
    # Allow icmp between any nebula hosts
    - port: any
      proto: icmp
      host: any
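For context, a non-lighthouse node pointed at this lighthouse would need roughly the inverse settings: the lighthouse's nebula IP in `lighthouse.hosts`, and its externally routable address (here, whatever the LoadBalancer Service below exposes) in `static_host_map`. A minimal sketch with placeholder paths and addresses:

```yaml
pki:
  ca: /etc/nebula/ca.crt
  cert: /etc/nebula/host.crt
  key: /etc/nebula/host.key

# Map the lighthouse's nebula IP to the Service's external IP and port
static_host_map:
  "192.168.100.1": ["<loadbalancer-external-ip>:42420"]

lighthouse:
  am_lighthouse: false
  interval: 60
  hosts:
    - "192.168.100.1"

punchy:
  punch: true
```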
  3. k8s deployment and service
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nebula
  labels:
    app: nebula
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nebula
  template:
    metadata:
      labels:
        app: nebula
    spec:
      containers:
        - name: nebula
          image: myorg/nebula
          ports:
            - name: udp
              containerPort: 42420
          volumeMounts:
            - name: certs
              mountPath: "/app/certs"
              readOnly: true
            - mountPath: /dev/net/tun
              name: dev-tun
          securityContext:
            capabilities:
              add: ["NET_BROADCAST","NET_ADMIN", "NET_RAW"]
      initContainers:
        - name: install
          image: busybox
          # run both steps in one shell; listed as separate items they would
          # all be passed as arguments to mkdir
          command:
            - sh
            - -c
            - mkdir -p /dev/net && mknod /dev/net/tun c 10 200
      volumes:
        - name: dev-tun
          hostPath:
            path: /dev/net/tun
            type: CharDevice
        - name: certs
          projected:
            sources:
              - secret:
                  name: nebca
              - secret:
                  name: nebcert
              - secret:
                  name: nebkey
      imagePullSecrets:
        - name: regcred

---
apiVersion: v1
kind: Service
metadata:
  name: nebula
  labels:
    app: nebula
spec:
  type: LoadBalancer
  selector:
    app: nebula
  ports:
    - name: udp
      port: 42420
      targetPort: 42420
      protocol: UDP
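Once the Service has an external address, something along these lines confirms what peers should put in their `static_host_map` (assuming the cloud provider actually provisions UDP LoadBalancers):

```sh
# EXTERNAL-IP is the address peer nodes would use in static_host_map
kubectl get svc nebula

# or just the IP
kubectl get svc nebula -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
```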
  4. Logs from the container/pod
{"firewallRule":{"caName":"","caSha":"","direction":"outgoing","endPort":0,"groups":null,"host":"any","ip":"","proto":0,"startPort":0},"level":"info","msg":"Firewall rule added","time":"2021-07-21T04:13:20Z"}
{"firewallRule":{"caName":"","caSha":"","direction":"incoming","endPort":0,"groups":null,"host":"any","ip":"","proto":1,"startPort":0},"level":"info","msg":"Firewall rule added","time":"2021-07-21T04:13:20Z"}
{"firewallHash":"blabla","level":"info","msg":"Firewall started","time":"2021-07-21T04:13:20Z"}
{"level":"info","msg":"Main HostMap created","network":{"IP":"192.168.100.1","Mask":"////AA=="},"preferredRanges":null,"time":"2021-07-21T04:13:20Z"}
{"level":"info","msg":"UDP hole punching enabled","time":"2021-07-21T04:13:20Z"}
{"build":"1.4.0","interface":"nebula1","level":"info","msg":"Nebula interface is active","network":"192.168.100.1/24","time":"2021-07-21T04:13:20Z","udpAddr":{"ip":"0.0.0.0","port":42420}}
  5. ip addr in the pod
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
3: eth0@if53: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1460 qdisc noqueue state UP group default 
    link/ether 16:4e:51:2f:9c:36 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.64.51/24 brd 192.168.64.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::144e:51ff:fe2f:9c36/64 scope link 
       valid_lft forever preferred_lft forever
5: nebula1: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1300 qdisc pfifo_fast state UNKNOWN group default qlen 500
    link/none 
    inet 192.168.100.1/24 scope global nebula1
       valid_lft forever preferred_lft forever
    inet6 fe80::708:233e:d9b1:9d73/64 scope link stable-privacy 
       valid_lft forever preferred_lft forever

But I cannot seem to get it to work. If I ssh into the pod that's running nebula, start another nebula instance on a different port, and try to connect to 4242, it works. But accessing the service from outside doesn't seem to work.

Does anyone have any clue as to what's going on?

rolandjitsu — Jul 21 '21 04:07
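One way to narrow this down is to check whether outside UDP traffic reaches the pod at all. A rough sketch (assumes `tcpdump` has been installed in the image and `nc` is available on the outside machine):

```sh
# Inside the pod, watch for inbound packets on the nebula port:
kubectl exec -it deploy/nebula -- tcpdump -ni eth0 udp port 42420

# From outside the cluster, send a throwaway UDP packet at the LoadBalancer IP
# (nebula won't answer it; this only generates traffic to observe):
echo ping | nc -u -w1 <loadbalancer-external-ip> 42420
```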

Hmm, it seems like adding NET_BIND_SERVICE to the capabilities does the trick. I can ping the lighthouse from either node, but I cannot ping between nodes 😕

Actually, I can ping between nodes that are running on Debian, but I cannot ping between Debian <> macOS (the mac can only ping the lighthouse and nothing else).

rolandjitsu — Jul 21 '21 08:07
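For anyone following along, the change described above amounts to adding one capability to the container's securityContext in the Deployment posted earlier (a sketch of just the relevant fragment):

```yaml
          securityContext:
            capabilities:
              add: ["NET_BROADCAST", "NET_ADMIN", "NET_RAW", "NET_BIND_SERVICE"]
```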

Is the macOS box on the same physical network as the k8s instances? If not, do the k8s machines have access to the internet? If they do, have you tried enabling punchy.respond: true?

nbrownus — Nov 09 '21 20:11

> Is the macOS box on the same physical network as the k8s instances? If not, do the k8s machines have access to the internet? If they do, have you tried enabling punchy.respond: true?

The macOS machine is not on the same network (the k8s cluster is on GCP, the mac is my dev machine at home). Yes, the k8s nodes do have access to the internet. I cannot recall if I tried that option, but I might have.

rolandjitsu — Nov 10 '21 04:11
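For reference, the option suggested above lives under `punchy` in the node config (see the commented example in the config earlier); enabled, it looks like this:

```yaml
punchy:
  punch: true
  # have the far side dial back out when hole punching fails
  # (useful when a node sits behind a symmetric NAT)
  respond: true
  # optional delay before responding, for misbehaving NATs
  delay: 1s
```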

~~Are your GCP firewall rules set up to allow the lighthouse to communicate w/ the world over udp/4242?~~ nvm, I didn't understand your previous comment

jasikpark — May 27 '22 18:05