cloud-provider-openstack
[occm] Cannot access loadbalancer from non kubernetes nodes
Is this a BUG REPORT or FEATURE REQUEST?: BUG REPORT
What happened: First, I set up OpenStack (Xena) with Octavia on the openstack-server machine and verified that a load balancer works.
Then I installed a Kubernetes cluster (v1.22.8) with Kubespray on two VMs created on that OpenStack:
NAME STATUS ROLES AGE VERSION
octavia-k8s-1 Ready control-plane,master 18h v1.22.8
octavia-k8s-2 Ready <none> 18h v1.22.8
I configured Kubernetes for OCCM (v1.22.1) and applied several YAML files from ~/cloud-provider-openstack/manifests/controller-manager (cloud-controller-manager-roles.yaml, openstack-cloud-controller-manager-ds.yaml, cloud-controller-manager-role-bindings.yaml, kubeadm.conf, openstack-cloud-controller-manager-pod.yaml).
Finally, I applied ~/cloud-provider-openstack/examples/loadbalancers/external-http-nginx.yaml.
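For reference, the apply sequence looked roughly like the commands below. This is only a sketch: it assumes the cloud.conf secret for OCCM had already been created in kube-system as described in the OCCM docs, and the file names are the ones listed above.
# Sketch only: apply the OCCM manifests, then the example Service.
# Assumes the cloud-config secret for OCCM already exists in kube-system.
cd ~/cloud-provider-openstack/manifests/controller-manager
kubectl apply -f cloud-controller-manager-roles.yaml
kubectl apply -f cloud-controller-manager-role-bindings.yaml
kubectl apply -f openstack-cloud-controller-manager-ds.yaml
kubectl apply -f ~/cloud-provider-openstack/examples/loadbalancers/external-http-nginx.yaml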
The external-http-nginx-service received an external IP successfully:
# kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
external-http-nginx-service LoadBalancer 10.233.45.4 1.2.1.242 80:31206/TCP 55s
kubernetes ClusterIP 10.233.0.1 <none> 443/TCP 19h
From the Kubernetes nodes, I can access external-http-nginx-service at 1.2.1.242 without problems:
root@octavia-k8s-1:~# curl 1.2.1.242
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
...
BUT from other VMs in the same subnet, and from the physical machine (openstack-server), I cannot reach 1.2.1.242:
ubuntu@openstack-server:~$ curl 1.2.1.242
curl: (52) Empty reply from server
I guess I'm missing some configuration.
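If it helps with triage, the Octavia side can be inspected with checks along these lines; the load balancer and pool identifiers below are placeholders, since OCCM derives the real names from the cluster and Service.
# Sketch only: check the load balancer that OCCM created (IDs are placeholders)
openstack loadbalancer list
openstack loadbalancer show <lb-id>           # operating_status should be ONLINE
openstack loadbalancer listener list
openstack loadbalancer member list <pool-id>  # members should be the node IPs on the NodePort (31206)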
What you expected to happen: 1.2.1.242 should also be reachable from other VMs in the same subnet and from openstack-server.
How to reproduce it:
Anything else we need to know?: I installed OpenStack (Xena) with kolla-ansible.
The output of ip a on the Kubernetes master is as below:
root@octavia-k8s-1:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc fq_codel state UP group default qlen 1000
link/ether fa:16:3e:92:a7:e6 brd ff:ff:ff:ff:ff:ff
inet 10.64.0.7/24 brd 10.64.0.255 scope global dynamic ens3
valid_lft 15733sec preferred_lft 15733sec
inet6 fe80::f816:3eff:fe92:a7e6/64 scope link
valid_lft forever preferred_lft forever
3: kube-ipvs0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
link/ether d6:c8:83:d0:eb:d7 brd ff:ff:ff:ff:ff:ff
inet 10.233.0.1/32 scope global kube-ipvs0
valid_lft forever preferred_lft forever
inet 10.233.0.3/32 scope global kube-ipvs0
valid_lft forever preferred_lft forever
inet 10.233.45.4/32 scope global kube-ipvs0
valid_lft forever preferred_lft forever
inet 1.2.1.242/32 scope global kube-ipvs0
valid_lft forever preferred_lft forever
inet6 fe80::d4c8:83ff:fed0:ebd7/64 scope link
valid_lft forever preferred_lft forever
6: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1430 qdisc noqueue state UNKNOWN group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
inet 10.233.79.0/32 scope global tunl0
valid_lft forever preferred_lft forever
7: calida9c852d892@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1430 qdisc noqueue state UP group default
link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet6 fe80::ecee:eeff:feee:eeee/64 scope link
valid_lft forever preferred_lft forever
8: cali15fc62ac7b1@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1430 qdisc noqueue state UP group default
link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 1
inet6 fe80::ecee:eeff:feee:eeee/64 scope link
valid_lft forever preferred_lft forever
9: nodelocaldns: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
link/ether be:93:fd:e0:a3:a8 brd ff:ff:ff:ff:ff:ff
inet 169.254.25.10/32 scope global nodelocaldns
valid_lft forever preferred_lft forever
11: cali7b3550237a5@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1430 qdisc noqueue state UP group default
link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 2
inet6 fe80::ecee:eeff:feee:eeee/64 scope link
valid_lft forever preferred_lft forever
The output of ip route on openstack-server is as below:
ubuntu@openstack-server:~$ ip route
default via 1.2.1.1 dev eno3 onlink
10.64.0.0/24 via 1.2.1.244 dev eno3
10.254.0.0/24 via 1.2.1.244 dev eno3
1.2.1.0/24 dev eno3 proto kernel scope link src 1.2.1.235
1.2.1.240/28 via 1.2.1.244 dev eno3
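If it helps, the path a request actually takes from openstack-server toward the LB IP can be double-checked with something like the following (just a sketch):
# Sketch only: confirm which route/interface the request to the LB IP uses
ip route get 1.2.1.242
sudo traceroute -n 1.2.1.242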
Environment:
- openstack-cloud-controller-manager(or other related binary) version: v1.22.1
- OpenStack version: xena
- Others: ubuntu, kolla-ansible
This is weird. The LB is a VM with haproxy pre-installed by default, and the IP 1.2.1.242 should be the IP of the LB. I know we had a security group fix recently (https://github.com/kubernetes/cloud-provider-openstack/issues/1830), but I'm not sure it's related. Since you can curl from one machine but not the other, it seems firewall-related.
Are you able to check the OCCM logs and see anything suspicious?
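One way to rule the security group angle in or out is to look at the rules on the LB's VIP port, roughly like this (IDs are placeholders):
# Sketch only: inspect the security group attached to the Octavia VIP port
openstack loadbalancer show <lb-id>        # note the vip_port_id
openstack port show <vip-port-id>          # note the security_group_ids
openstack security group rule list <sg-id> # expect an ingress rule allowing TCP 80 from your source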
@jichenjc Thanks for your comment.
Here is the full OCCM log. I'm using the latest OCCM version with log level 4.
In this log, I think the following warning is suspicious, but I don't know what it means:
W0520 20:30:23.777141 1 openstack.go:325] Failed to create an OpenStack Secret client: unable to initialize keymanager client for region RegionOne: No suitable endpoint could be found in the service catalog.
root@octavia-k8s-1:~/cloud-provider-openstack/examples/loadbalancers# k get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
external-http-nginx-service LoadBalancer 10.233.10.134 1.2.1.249 80:31217/TCP 7m28s
kubernetes ClusterIP 10.233.0.1 <none> 443/TCP 32h
Failed to create an OpenStack Secret client: unable to initialize keymanager client for region RegionOne: No suitable endpoint could be found in the service catalog.
This is OK; it only tells us that OCCM can't find the Barbican service in your catalog, which is optional.
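As a quick sanity check, whether a key-manager endpoint is registered at all can be confirmed with something like:
# Sketch only: an empty result or "service not found" here just confirms Barbican is absent, which is fine
openstack catalog list
openstack endpoint list --service key-manager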
The log you provided seems truncated, and I see nothing special up to I0522 13:46:34.568918, which is the last entry I can see.
Have you run tcpdump on the VM to see whether anything is wrong there? This is beyond my expertise now... not sure whether someone else has the background?
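For the tcpdump idea, a capture along these lines on both a Kubernetes node and openstack-server should show where the traffic stops (interface names taken from the outputs above):
# Sketch only: run while repeating the failing curl from openstack-server
# On octavia-k8s-1:
sudo tcpdump -ni ens3 'host 1.2.1.242 and tcp port 80'
# On openstack-server:
sudo tcpdump -ni eno3 'host 1.2.1.242 and tcp port 80'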
@jichenjc Thanks for your comment.
Do you know of any step-by-step installation guide for OpenStack with Octavia? I'll try a clean install.
I guess my network configuration may be wrong.
I think https://docs.openstack.org/devstack/latest/guides/devstack-with-lbaas-v2.html might be the easiest way to go..
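If you go the DevStack route, the Octavia plugin is enabled from local.conf; roughly the lines below, added inside the [[local|localrc]] section (a sketch only; the usual HOST_IP and password settings are assumed to already be there):
# Sketch only: enable Octavia in DevStack's local.conf, then run ./stack.sh
enable_plugin octavia https://opendev.org/openstack/octavia
enable_service octavia o-api o-cw o-hm o-hk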
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.