cluster-lab
sed command failing with vagrant and Docker 1.11
While starting the cluster-lab with a vagrant up after a vagrant destroy, I get the following log output:
==> follower2: Setting up hypriot-cluster-lab-src (0.2.12-1) ...
==> follower2: Created symlink from /etc/systemd/system/multi-user.target.wants/cluster-lab.service to /etc/systemd/system/cluster-lab.service.
==> follower2: cp: cannot stat ‘/etc/systemd/system/docker.service’: No such file or directory
==> follower2: sed: can't read /etc/systemd/system/docker.service: No such file or directory
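The cp/sed failure can be reproduced in isolation: sed -i exits non-zero when its target file does not exist, which is exactly the error in the log. A minimal sketch of the failing step plus a guarded variant (all paths below are scratch stand-ins, not the actual cluster-lab code):

```shell
#!/bin/sh
# Scratch reproduction: the unit file the script expects is missing,
# so sed fails just like in the vagrant log above.
workdir=$(mktemp -d)
unit="$workdir/docker.service"

if sed -i 's/a/b/' "$unit" 2>/dev/null; then
  status="patched"
else
  status="sed could not read $unit"
fi

# A guarded version verifies the file exists before patching it.
printf 'ExecStart=/usr/bin/docker daemon\n' > "$unit"
if [ -f "$unit" ]; then
  sed -i 's|daemon|daemon -H tcp://0.0.0.0:2375|' "$unit"
fi
result=$(cat "$unit")
rm -rf "$workdir"
echo "$status"
echo "$result"
```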
A docker info against Swarm results in the following output:
root@follower1:/home/vagrant# DOCKER_HOST=tcp://192.168.200.1:2378 docker info
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 0
Role: primary
Strategy: spread
Filters: health, port, dependency, affinity, constraint
Nodes: 3
(unknown): 192.168.200.45:2375
└ Status: Pending
└ Containers: 0
└ Reserved CPUs: 0 / 0
└ Reserved Memory: 0 B / 0 B
└ Labels:
└ Error: Cannot connect to the docker engine endpoint
└ UpdatedAt: 2016-06-08T05:03:41Z
(unknown): 192.168.200.1:2375
└ Status: Pending
└ Containers: 0
└ Reserved CPUs: 0 / 0
└ Reserved Memory: 0 B / 0 B
└ Labels:
└ Error: Cannot connect to the docker engine endpoint
└ UpdatedAt: 2016-06-08T04:58:31Z
(unknown): 192.168.200.26:2375
└ Status: Pending
└ Containers: 0
└ Reserved CPUs: 0 / 0
└ Reserved Memory: 0 B / 0 B
└ Labels:
└ Error: Cannot connect to the docker engine endpoint
└ UpdatedAt: 2016-06-08T05:01:01Z
Plugins:
Volume:
Network:
Kernel Version: 4.2.0-30-generic
Operating System: linux
Architecture: amd64
CPUs: 0
Total Memory: 0 B
Name: e62a0f42529d
Docker Root Dir:
Debug mode (client): false
Debug mode (server): false
WARNING: No kernel memory limit support
A docker info against the local Docker installation results in:
root@follower1:/home/vagrant# docker info
Containers: 3
Running: 3
Paused: 0
Stopped: 0
Images: 2
Server Version: 1.11.2
Storage Driver: overlay
Backing Filesystem: extfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge null host
Kernel Version: 4.2.0-30-generic
Operating System: Ubuntu 15.10
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 992.9 MiB
Name: follower1
ID: FJYP:QGBI:QQRC:DCXS:OEOW:36JV:JMPV:DTFV:6B6K:C4XO:PEQO:LJYE
Docker Root Dir: /var/lib/docker
Debug mode (client): false
Debug mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
A cluster-lab health check shows:
root@follower1:/home/vagrant# cluster-lab health
Internet Connection
[PASS] eth1 exists
[PASS] eth1 has an ip address
[PASS] Internet is reachable
[PASS] DNS works
Networking
[PASS] eth1.200 exists
[PASS] eth1.200 has correct IP from vlan network
[PASS] Cluster leader is reachable
[PASS] eth1.200 has exactly one IP
[PASS] eth1.200 has no local link address
[PASS] Avahi process exists
[PASS] Avahi is using eth1.200
Docker
[PASS] Docker is running
[FAIL] Docker is configured to use Consul as key-value store
[FAIL] Docker is configured to listen via tcp at port 2375
[FAIL] Docker listens on 192.168.200.26 via tcp at port 2375 (Docker-Engine)
Consul
[PASS] Consul Docker image exists
[PASS] Consul Docker container is running
[PASS] Consul is listening on port 8300
[PASS] Consul is listening on port 8301
[PASS] Consul is listening on port 8302
[PASS] Consul is listening on port 8400
[PASS] Consul is listening on port 8500
[PASS] Consul is listening on port 8600
[PASS] Consul API works
[PASS] Cluster-Node is pingable with IP 192.168.200.26
[PASS] Cluster-Node is pingable with IP 192.168.200.45
[PASS] Cluster-Node is pingable with IP 192.168.200.1
[PASS] No Cluster-Node is in status 'failed'
[FAIL] Consul is able to talk to Docker-Engine on port 7946 (Serf)
Swarm
[PASS] Swarm-Join Docker container is running
[PASS] Swarm-Manage Docker container is running
[PASS] Number of Swarm and Consul nodes is equal which means our cluster is healthy
It seems the Docker daemon was not configured correctly by the cluster-lab.
I guess the problem is related to the following line: https://github.com/hypriot/cluster-lab/blob/master/package/usr/local/lib/cluster-lab/docker_lib#L79-L81
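If those lines do what I suspect (copy docker.service and then patch it with sed), the root cause could be that on Ubuntu 15.10 the packaged unit lives in /lib/systemd/system, while /etc/systemd/system only holds admin overrides. A hedged sketch of a lookup helper that checks several unit directories (find_unit is an illustrative name, not part of cluster-lab; the demo uses scratch directories):

```shell
#!/bin/sh
# Illustrative helper: print the path of the first directory in the
# argument list that contains the named unit file; fail if none does.
find_unit() {
  name=$1; shift
  for dir in "$@"; do
    if [ -f "$dir/$name" ]; then
      printf '%s\n' "$dir/$name"
      return 0
    fi
  done
  return 1
}

# Demonstrate with scratch directories standing in for the real ones.
etc_dir=$(mktemp -d)   # stand-in for /etc/systemd/system (empty here)
lib_dir=$(mktemp -d)   # stand-in for /lib/systemd/system
touch "$lib_dir/docker.service"

found=$(find_unit docker.service "$etc_dir" "$lib_dir")
rm -rf "$etc_dir" "$lib_dir"
echo "$found"
```

On a real node one would call it as src=$(find_unit docker.service /etc/systemd/system /lib/systemd/system /usr/lib/systemd/system) before the cp and sed steps.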
@firecyberice What do you think?
@Govinda-Fichtner @firecyberice I might have a fix for this. Testing now.
@Govinda-Fichtner can you please post the misconfigured /etc/systemd/system/docker.service file?
@firecyberice @Govinda-Fichtner The issue was in the script that copies the service file: it was putting it in /lib instead of /etc. See #49 for the fix.
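Based on that description, the corrected step presumably copies the packaged unit out of /lib/systemd/system into /etc/systemd/system and patches the copy there. A sketch against scratch directories (the sed expression is illustrative, not the one from #49):

```shell
#!/bin/sh
set -e
# Scratch stand-ins for /lib/systemd/system and /etc/systemd/system.
lib_dir=$(mktemp -d)
etc_dir=$(mktemp -d)
printf 'ExecStart=/usr/bin/docker daemon -H fd://\n' > "$lib_dir/docker.service"

# Copy the packaged unit into the admin directory, then patch the copy
# so the engine also listens via tcp on port 2375, as the health check expects.
cp "$lib_dir/docker.service" "$etc_dir/docker.service"
sed -i 's|-H fd://|-H fd:// -H tcp://0.0.0.0:2375|' "$etc_dir/docker.service"

patched=$(cat "$etc_dir/docker.service")
rm -rf "$lib_dir" "$etc_dir"
echo "$patched"
# On a real node this would be followed by:
#   systemctl daemon-reload && systemctl restart docker
```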
root@leader:~# cluster-lab health
Internet Connection
[PASS] eth1 exists
[PASS] eth1 has an ip address
[PASS] Internet is reachable
[PASS] DNS works
Networking
[PASS] eth1.200 exists
[PASS] eth1.200 has correct IP from vlan network
[PASS] Cluster leader is reachable
[PASS] eth1.200 has exactly one IP
[PASS] eth1.200 has no local link address
[PASS] Avahi process exists
[PASS] Avahi is using eth1.200
[PASS] Avahi cluster-leader.service file exists
DNSmasq
[PASS] dnsmasq process exists
[PASS] /etc/dnsmasq.conf backup file exists
Docker
[PASS] Docker is running
[PASS] Docker is configured to use Consul as key-value store
[PASS] Docker is configured to listen via tcp at port 2375
[PASS] Docker listens on 192.168.200.1 via tcp at port 2375 (Docker-Engine)
Consul
[PASS] Consul Docker image exists
[PASS] Consul Docker container is running
[PASS] Consul is listening on port 8300
[PASS] Consul is listening on port 8301
[PASS] Consul is listening on port 8302
[PASS] Consul is listening on port 8400
[PASS] Consul is listening on port 8500
[PASS] Consul is listening on port 8600
[PASS] Consul API works
[PASS] Cluster-Node is pingable with IP 192.168.200.38
[PASS] Cluster-Node is pingable with IP 192.168.200.1
[PASS] No Cluster-Node is in status 'failed'
[PASS] Consul is able to talk to Docker-Engine on port 7946 (Serf)
Swarm
[PASS] Swarm-Join Docker container is running
[PASS] Swarm-Manage Docker container is running
[PASS] Number of Swarm and Consul nodes is equal which means our cluster is healthy