ansible-for-kubernetes
Work along issues & notes (eBook version 2020-09-01)
Posting issues, notes & suggestions as I'm working along through the book:
My system:
❯ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.2 LTS
Release: 20.04
Codename: focal
❯ minikube version
minikube version: v1.18.0
commit: ec61815d60f66a6e4f6353030a40b12362557caa-dirty
Chapter 1
On Ubuntu, minikube start now uses "docker" as the default driver, not "virtualbox" as the sections afterwards imply.
To use "virtualbox" as the driver, one needs to run minikube start --driver=virtualbox. On subsequent runs of minikube start, the selected driver is remembered (it is saved in ~/.minikube) and doesn't need to be specified again.
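A minimal sketch of selecting the driver, assuming VirtualBox is already installed. The `minikube config set driver` line is an optional addition that is not in the book; it stores the default explicitly instead of relying on the first start.

```shell
# Driver the book's later sections assume; "docker" is the newer default on Ubuntu.
driver="virtualbox"

if command -v minikube >/dev/null 2>&1; then
  # Pass the driver explicitly for this start.
  minikube start --driver="$driver"

  # Optional (not from the book): store it as the default for new clusters.
  minikube config set driver "$driver"
else
  echo "minikube is not installed; skipping"
fi
```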
Chapter 2
Section: Installing Ansible
- pip and python-dev are no longer available through the default repos. The default Python version for Ubuntu/Debian is currently 3.8, so the commands need to be replaced by pip3 and python3-dev.
- Although you use Ansible v2.9.13 in the current version of the book, there's no mention of which Ansible versions would work (v3.x was released not too long ago).
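The replacement commands could look roughly like the sketch below. It assumes Ubuntu 20.04 and pins Ansible to the 2.9.13 release the book uses; whether newer releases work with the examples is still an open question. The `RUN` prefix and the `install_ansible` helper are made up for this sketch, and by default the commands are only printed.

```shell
# Prefix for every command: "echo" only prints them. Set RUN="" to execute for real.
RUN="${RUN-echo}"

# Version used in the book; adjust if you want to try a newer release.
ANSIBLE_VERSION="2.9.13"

install_ansible() {
  # Python 3 packages replace the old pip / python-dev names on Ubuntu 20.04.
  $RUN sudo apt install -y python3-pip python3-dev
  # Pin Ansible to the version the book was written against.
  $RUN pip3 install "ansible==${ANSIBLE_VERSION}"
}

install_ansible
```

Running it as `RUN="" sh script.sh` (with the block saved to a hypothetical `script.sh`) performs the installation instead of printing it.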
Chapter 3
Writing a Playbook to Build a Container Image
- You pin the Solr version to "8.6.2" in vars/main.yml but then use version "8.3.1" later on in "Writing a Playbook to Test the Container Image", where you use docker run -d -p 8983:8983 ansible-for-kubernetes/solr:8.3.1 to start the container.
Chapter 4
A Vagrantfile for local Infrastructure-as-Code
- No mention of the project directory name to use (cluster-local-vms), as you did in the previous chapters (although it's mentioned at the end of the chapter)
Running the cluster build playbook
- Running ansible-playbook -i inventory main.yml fails at TASK [geerlingguy.docker : Ensure dependencies are installed.] with the error:
fatal: [kube2]: FAILED! => {"cache_update_time": 1604516116, "cache_updated": false, "changed": false, "msg": "'/usr/bin/apt-get -y -o \"Dpkg::Options::=--force-confdef\" -o \"Dpkg::Options::=--force-confold\" install 'apt-transport-https' 'gnupg2'' failed: E: Failed to fetch http://security.debian.org/debian-security/pool/updates/main/a/apt/apt-transport-https_1.8.2.1_all.deb 404 Not Found [IP: 199.232.138.132 80]\nE: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?\n", "rc": 100, "stderr": "E: Failed to fetch http://security.debian.org/debian-security/pool/updates/main/a/apt/apt-transport-https_1.8.2.1_all.deb 404 Not Found [IP: 199.232.138.132 80]\nE: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?\n", "stderr_lines": ["E: Failed to fetch http://security.debian.org/debian-security/pool/updates/main/a/apt/apt-transport-https_1.8.2.1_all.deb 404 Not Found [IP: 199.232.138.132 80]", "E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?"], "stdout": "Reading package lists...\nBuilding dependency tree...\nReading state information...\nThe following NEW packages will be installed:\n apt-transport-https gnupg2\n0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.\nNeed to get 542 kB of archives.\nAfter this operation, 566 kB of additional disk space will be used.\nIgn:1 http://httpredir.debian.org/debian buster/main amd64 apt-transport-https all 1.8.2.1\nGet:2 http://httpredir.debian.org/debian buster/main amd64 gnupg2 all 2.2.12-1+deb10u1 [393 kB]\nErr:1 http://httpredir.debian.org/debian buster/main amd64 apt-transport-https all 1.8.2.1\n 404 Not Found [IP: 199.232.138.132 80]\nFetched 393 kB in 5s (75.7 kB/s)\n", "stdout_lines": ["Reading package lists...", "Building dependency tree...", "Reading state information...", "The following NEW packages will be installed:", " apt-transport-https gnupg2", "0 upgraded, 2 newly installed, 0 to remove and 0 not 
upgraded.", "Need to get 542 kB of archives.", "After this operation, 566 kB of additional disk space will be used.", "Ign:1 http://httpredir.debian.org/debian buster/main amd64 apt-transport-https all 1.8.2.1", "Get:2 http://httpredir.debian.org/debian buster/main amd64 gnupg2 all 2.2.12-1+deb10u1 [393 kB]", "Err:1 http://httpredir.debian.org/debian buster/main amd64 apt-transport-https all 1.8.2.1", " 404 Not Found [IP: 199.232.138.132 80]", "Fetched 393 kB in 5s (75.7 kB/s)"]}
for each host (kube1, kube2, kube3). The error is probably related to the age of the apt cache in the Vagrant boxes used, because I was able to fix it by running sudo apt update on each box (or, Ansible-style: ansible -m command -a 'sudo apt update' -i inventory all). Running the playbook again then succeeds.
An even better solution is to add the following to the pre_tasks of the playbook:
- name: Fix stale APT cache.
  apt:
    update_cache: yes
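For a one-off fix without editing the playbook, the ad-hoc command can use the apt module instead of wrapping sudo in the command module. This is a sketch that assumes the inventory file is called inventory, as in the chapter, and that the connecting user may escalate privileges.

```shell
inventory="inventory"

if command -v ansible >/dev/null 2>&1 && [ -f "$inventory" ]; then
  # -b escalates privileges (become), so no literal "sudo" is needed.
  ansible all -i "$inventory" -b -m apt -a "update_cache=yes"
else
  echo "ansible or the inventory file is missing; skipping"
fi
```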
Testing the cluster with a deployment using Ansible
- TASK [Create hello-k8s resources and wait until they are Ready.] fails with the error:
failed: [kube1] (item=hello-k8s-deployment.yml) => {"ansible_loop_var": "item", "changed": false, "item": "hello-k8s-deployment.yml", "msg": "Failed to get client due to HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /version (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fbf6d942350>: Failed to establish a new connection: [Errno 111] Connection refused',))"}
failed: [kube1] (item=hello-k8s-service.yml) => {"ansible_loop_var": "item", "changed": false, "item": "hello-k8s-service.yml", "msg": "Failed to get client due to HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /version (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f2114a50250>: Failed to establish a new connection: [Errno 111] Connection refused',))"}
I fixed this by pinning (downgrading) the openshift version in the test-deployment.yml playbook to v0.11.2, as outlined in this issue.
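To try the pin before changing the playbook, an ad-hoc equivalent might look like the sketch below. It assumes the same inventory file and that the pip module's default executable on kube1 is the one the playbook uses; the permanent fix remains the pinned version in test-deployment.yml.

```shell
inventory="inventory"
openshift_version="0.11.2"

if command -v ansible >/dev/null 2>&1 && [ -f "$inventory" ]; then
  # Install the pinned openshift Python client on the host that runs the k8s tasks.
  ansible kube1 -i "$inventory" -b -m pip \
    -a "name=openshift version=${openshift_version}"
else
  echo "ansible or the inventory file is missing; skipping"
fi
```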
Patching Flannel to use the right network interface
- The (rather) short hint on how to create a patch could use some more explanation, such as the exact command to use (e.g. diff -u kube-flannel.yml kube-flannel-virtualbox.yml > kube-flannel-patch.txt).
- The flannel DaemonSet is no longer called kube-flannel-ds-amd64 but kube-flannel-ds.
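The patch-creation step can be tried in isolation with the sketch below. The two files are tiny stand-ins, not the real flannel manifest, and the --iface=enp0s8 value is an assumed VirtualBox interface name used only for illustration.

```shell
# Work in a scratch directory so nothing real is touched.
workdir="$(mktemp -d)"

# Stand-in for the upstream manifest (NOT the real kube-flannel.yml).
cat > "$workdir/kube-flannel.yml" <<'EOF'
        args:
        - --ip-masq
        - --kube-subnet-mgr
EOF

# Modified copy with the extra argument added by hand.
cat > "$workdir/kube-flannel-virtualbox.yml" <<'EOF'
        args:
        - --ip-masq
        - --kube-subnet-mgr
        - --iface=enp0s8
EOF

# diff exits with status 1 when the files differ, which is expected here.
(
  cd "$workdir" &&
  diff -u kube-flannel.yml kube-flannel-virtualbox.yml > kube-flannel-patch.txt
) || true

# Show the resulting unified diff.
cat "$workdir/kube-flannel-patch.txt"
```

The order of the two file arguments matters: the original comes first, so that added lines show up with a leading + in the patch.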
Chapter 5
Authenticating to the EKS Cluster via kubeconfig
The aws-iam-authenticator is no longer required when using aws-cli version 1.16.156 or later.
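A quick way to check which case applies on a given machine is sketched below. The version_ge helper is made up for this example and relies on sort -V from GNU coreutils; the parsing assumes aws --version prints a string starting with aws-cli/ followed by the version number.

```shell
# Return success when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) putting the lower version first.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v aws >/dev/null 2>&1; then
  # Extract the version number that follows "aws-cli/" in the version string.
  installed="$(aws --version 2>&1 | cut -d/ -f2 | cut -d' ' -f1)"
  if version_ge "$installed" "1.16.156"; then
    echo "aws-cli $installed can issue EKS tokens itself (aws eks get-token)"
  else
    echo "aws-cli $installed is too old; aws-iam-authenticator is still needed"
  fi
else
  echo "aws-cli is not installed; skipping"
fi
```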
Chapter 6
(TODO in book)
Chapter 7
Manage Kind with Molecule
The default YAML files generated with molecule init scenario seem to have changed (quite a bit). Following along and making the changes as suggested results in molecule test throwing an error:
TASK [Gathering Facts] *********************************************************
fatal: [molecule-test]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname molecule-test: Name or service not known", "unreachable": true}
I suspect this has to do with the absent connection: local in converge.yml, because after I modified converge.yml as outlined in the next subchapter of the book ("Test a playbook in Kind with Molecule"), the error went away.
Test a playbook in Kind with Molecule
molecule converge runs successfully (docker ps shows me a running container kindest/node:v1.20.2), but if I run kubectl get job hello after that, I get an error:
The connection to the server localhost:8080 was refused - did you specify the right host or port?
It seems the new kubeconfig (~/.kube/config-molecule-test) is not being used. I solved that by running export KUBECONFIG=~/.kube/config-molecule-test.
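Both ways of pointing kubectl at the Molecule-generated kubeconfig are sketched below, assuming the file was written to the path mentioned above. The --kubeconfig variant is an alternative that leaves the shell environment untouched.

```shell
# Point kubectl at the kubeconfig that Molecule wrote for the Kind cluster.
export KUBECONFIG="$HOME/.kube/config-molecule-test"

if command -v kubectl >/dev/null 2>&1; then
  # Uses the exported KUBECONFIG for the rest of this shell session.
  kubectl get job hello

  # One-off alternative: pass the file explicitly for a single command.
  kubectl --kubeconfig "$HOME/.kube/config-molecule-test" get job hello
else
  echo "kubectl is not installed; skipping"
fi
```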
Kubernetes CI Testing in GitHub Actions
- You start off with the filename molecule-kind.yml, and later in the subchapter you say "Once you have the ci.yml workflow file added to your repository".
Regards, Ivo
I would add that for chapter 2 I encountered issues running the playbook, specifically:
TASK [Create a Deployment for Hello Go.] ***********************************************
fatal: [127.0.0.1]: FAILED! => {"changed": false, "msg": "Failed to get client due to HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /version (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f4f9c9daf40>: Failed to establish a new connection: [Errno 111] Connection refused'))"}
I tracked the issue down to my installed version of Ansible, as described here: https://github.com/kubernetes-client/python/issues/1333
Solution was to update my install of ansible:
sudo apt remove ansible
sudo add-apt-repository --yes --update ppa:ansible/ansible
sudo apt update
sudo apt install ansible
After updating, things worked.