talos-vagrant
Vagrant Environment for playing with Talos
This is a Vagrant Environment for playing with Talos.
For playing with Sidero see the rgl/sidero-vagrant repository.
Table Of Contents
- Architecture
- Usage
- Network Packet Capture
- Network Booting
- Tested Physical Machines
- Troubleshoot
- Alternatives and Related Projects
- References
Architecture
Usage
Install docker, vagrant, vagrant-libvirt, and the Ubuntu Base Box.
Log in to Docker Hub to get higher rate limits.
If you want to connect to the external physical network, you must configure your host network as described in rgl/ansible-collection-tp-link-easy-smart-switch (e.g. have the br-rpi Linux bridge) and set CONFIG_PANDORA_BRIDGE_NAME in the Vagrantfile.
Bring up the cluster virtual machines:
time ./bring-up.sh | tee bring-up.log
Access talos:
export TALOSCONFIG="$PWD/shared/talosconfig"
./shared/talosctl --nodes cp1,w1 version
Access kubernetes:
export KUBECONFIG="$PWD/shared/kubeconfig"
./shared/kubectl get nodes -o wide
Start an example service in each worker node:
vagrant ssh -c 'bash /vagrant/provision-example-daemonset.sh' pandora
Access the example service:
vagrant ssh -c "watch -n .2 'wget -qO- http://example-daemonset.\$(hostname --domain)?format=text | tail -25; kubectl get pod -l app=example-daemonset -o=custom-columns=NODE:.spec.nodeName,STATUS:.status.phase,NAME:.metadata.name'" pandora
List this repository's dependencies (and which ones have newer versions):
export GITHUB_COM_TOKEN='YOUR_GITHUB_PERSONAL_TOKEN'
./renovate.sh
Network Packet Capture
You can capture and inspect traffic from the host with the wireshark.sh script, e.g., to capture the traffic from the eth1 interface:
./wireshark.sh pandora eth1
Host DNS resolver
To resolve the talos.test zone through the Kubernetes-managed external DNS server (running in pandora), you need to configure your system to delegate that DNS zone to the pandora DNS server. One way to do that is to configure your system to use only dnsmasq.
For example, on my Ubuntu 22.04 Desktop, I have uninstalled resolvconf, disabled NetworkManager, and manually configured the network interfaces:
sudo su -l
for n in NetworkManager NetworkManager-wait-online NetworkManager-dispatcher network-manager; do
  systemctl mask --now "$n"
done
apt-get remove --purge resolvconf
cat >/etc/network/interfaces <<'EOF'
# interfaces(5) file used by ifup(8) and ifdown(8)
auto lo
iface lo inet loopback
auto enp3s0
iface enp3s0 inet dhcp
EOF
reboot
Then, I replaced systemd-resolved with dnsmasq:
sudo su -l
apt-get install -y --no-install-recommends dnsutils dnsmasq
systemctl mask --now systemd-resolved
cat >/etc/dnsmasq.d/local.conf <<EOF
no-resolv
bind-interfaces
interface=lo
listen-address=127.0.0.1
# delegate to the Cloudflare/APNIC Public DNS IP addresses.
# NB these are only used if there's no entry in /etc/hosts.
server=1.1.1.1
server=1.0.0.1
# delegate to the Google Public DNS IP addresses.
# NB these are only used if there's no entry in /etc/hosts.
#server=8.8.8.8
#server=8.8.4.4
EOF
cat >/etc/dnsmasq.d/talos.test.conf <<EOF
# delegate the talos.test zone to the pandora DNS server IP address.
# NB use the CONFIG_PANDORA_IP variable value defined in the Vagrantfile.
server=/talos.test/10.10.0.2
EOF
rm /etc/resolv.conf
cat >/etc/resolv.conf <<EOF
nameserver 127.0.0.1
EOF
systemctl restart dnsmasq
exit
Then start all the machines and test the DNS resolution:
vagrant up
dig pandora.talos.test
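If the lookup fails, it can help to compare the answer given by the pandora DNS server directly with the answer given by the local dnsmasq. The following is a minimal sketch, not part of this repository: the check_delegation function name and the DIG override variable are hypothetical, and 10.10.0.2 is assumed to be the CONFIG_PANDORA_IP value used in the dnsmasq configuration above.

```shell
# Hypothetical helper (not part of this repository).
# Compares the answer from the pandora DNS server with the answer from the
# local dnsmasq (127.0.0.1). DIG can be set to use another dig binary.
# Usage: check_delegation NAME [PANDORA_IP]
check_delegation() {
  local name="$1"
  local pandora_ip="${2:-10.10.0.2}" # assumed CONFIG_PANDORA_IP value
  local direct local_answer
  # ask the pandora DNS server directly.
  direct="$("${DIG:-dig}" +short "@$pandora_ip" "$name")"
  # ask the local dnsmasq, which should delegate talos.test to pandora.
  local_answer="$("${DIG:-dig}" +short @127.0.0.1 "$name")"
  if [ -n "$direct" ] && [ "$direct" = "$local_answer" ]; then
    echo "ok: $name -> $direct"
  else
    echo "mismatch: pandora='$direct' local='$local_answer'" >&2
    return 1
  fi
}
```

For example, check_delegation pandora.talos.test prints an ok line when both servers return the same non-empty answer, and returns a non-zero status otherwise.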
Network Booting
This environment uses PXE/TFTP/iPXE/HTTP/UEFI-HTTP to network boot the machines.
The Virtual Machines are automatically configured to network boot.
To boot Physical Machines you have to:
- Create a Linux Bridge that can reach a Physical Switch that connects to your Physical Machines.
  - This environment assumes you have a setup like rgl/ansible-collection-tp-link-easy-smart-switch.
  - To configure it otherwise you must modify the Vagrantfile.
- Add your machines to machines.yaml.
- Configure your machines to PXE boot.
Tested Physical Machines
This was tested on the following physical machines and boot modes:
- Seeed Studio Odyssey X86J4105
  - It boots using UEFI/HTTP/PXE.
- HP EliteDesk 800 35W G2 Desktop Mini
  - It boots using UEFI/TFTP/PXE.
  - This machine can be remotely managed with MeshCommander.
    - It was configured as described at rgl/intel-amt-notes.
- Raspberry Pi 4 (8GB)
  - It boots using UEFI/HTTP/iPXE.
Notes
- The machine boot order must be disk and network.
  - Talos expects to be run from disk.
- Do not configure any default nodes with talosctl config node.
  - Instead, explicitly target the node with talosctl -n {node}.
  - Having default nodes could lead to mistakes (e.g. upgrading the whole cluster at the same time).
- The user only needs to access the talos control plane machines.
  - A control plane machine will proxy the requests to the internal cluster nodes.
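One way to make explicit node targeting a habit is a small shell wrapper that refuses to run without a node argument. This is only a sketch: the talos_node function name and the TALOSCTL override variable are hypothetical and are not provided by Talos or by this repository.

```shell
# Hypothetical wrapper (not part of Talos or this repository).
# Always requires an explicit node as the first argument.
# TALOSCTL can be set to use another binary (e.g. ./shared/talosctl).
# Usage: talos_node NODE COMMAND [ARGS...]
talos_node() {
  if [ "$#" -lt 2 ]; then
    echo "usage: talos_node NODE COMMAND [ARGS...]" >&2
    return 2
  fi
  local node="$1"
  shift
  # pass the node explicitly, then the remaining arguments untouched.
  "${TALOSCTL:-talosctl}" --nodes "$node" "$@"
}
```

For example, after export TALOSCTL="$PWD/shared/talosctl", running talos_node cp1 version is equivalent to ./shared/talosctl --nodes cp1 version, while talos_node version fails with a usage message instead of falling back to any default node.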
Troubleshoot
- Talos
  - Troubleshooting Control Plane
  - talosctl -n cp1 service etcd status
  - talosctl -n cp1 etcd members
  - talosctl -n cp1 get members
  - talosctl -n cp1 dashboard
  - talosctl -n cp1 logs controller-runtime
  - talosctl -n cp1 logs kubelet
  - talosctl -n cp1 disks
  - talosctl -n cp1 mounts
  - talosctl -n cp1 get resourcedefinitions
  - talosctl -n cp1 get machineconfigs -o yaml
  - talosctl -n cp1 get staticpods -o yaml
  - talosctl -n cp1 get staticpodstatus
  - talosctl -n cp1 get manifests
  - talosctl -n cp1 get services
  - talosctl -n cp1 get extensions
  - talosctl -n cp1 get addresses
  - talosctl -n cp1 get nodeaddresses
  - talosctl -n cp1 list -l -r -t f /etc
  - talosctl -n cp1 list -l -r -t f /system
  - talosctl -n cp1 list -l -r -t f /var
  - talosctl -n cp1 list -l /sys/fs/cgroup
  - talosctl -n cp1 read /proc/cmdline | tr ' ' '\n'
  - talosctl -n cp1 read /proc/mounts | sort
  - talosctl -n cp1 read /etc/resolv.conf
  - talosctl -n cp1 read /etc/containerd/config.toml
  - talosctl -n cp1 read /etc/cri/containerd.toml
  - talosctl -n cp1 read /etc/cri/conf.d/cri.toml (registry credentials)
  - talosctl -n cp1 read /etc/cri/conf.d/hosts/docker.io/hosts.toml (registry mirror)
  - talosctl -n cp1 ps
  - talosctl -n cp1 containers -k
  - talos-poke cp1
- Kubernetes
  - kubectl get events --all-namespaces --watch
  - kubectl --namespace kube-system get events --watch
  - kubectl run busybox -it --rm --restart=Never --image=busybox:1.33 -- nslookup -type=a pandora.talos.test
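When comparing several nodes, it can be convenient to save the output of a few of the read-only commands above into one file per node. The following is a minimal sketch under stated assumptions: the collect_diag function name, the TALOSCTL override variable, and the choice of commands are illustrative and are not part of this repository.

```shell
# Hypothetical helper (not part of this repository).
# Saves the output of a few read-only talosctl commands, one file per node.
# TALOSCTL can be set to use another binary (e.g. ./shared/talosctl).
# Usage: collect_diag OUT_DIR NODE [NODE...]
collect_diag() {
  local out_dir="$1" node
  shift
  mkdir -p "$out_dir"
  for node in "$@"; do
    # errors are captured too, so a failing command does not hide the rest.
    {
      "${TALOSCTL:-talosctl}" --nodes "$node" get members
      "${TALOSCTL:-talosctl}" --nodes "$node" get addresses
      "${TALOSCTL:-talosctl}" --nodes "$node" read /proc/cmdline
      "${TALOSCTL:-talosctl}" --nodes "$node" read /etc/resolv.conf
    } >"$out_dir/$node.txt" 2>&1
  done
}
```

For example, collect_diag diag cp1 w1 writes diag/cp1.txt and diag/w1.txt, which can then be compared with diff.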
Alternatives and Related Projects
References
- Talos
- Linux
- iPXE
- Raspberry Pi
- Matchbox
- Dynamic Host Configuration Protocol (DHCP)