kubernetes-pfsense-controller
pfsense getting constant updates
HAProxy on pfSense keeps getting reloaded, which prevents HAProxy from holding a connection.
As you can see from the logs below, every second or so it goes and updates pfSense. What could I be doing wrong?
2022-02-23T22:40:28+00:00 plugin (haproxy-declarative): successfully reloaded HAProxy service
2022-02-23T22:40:28+00:00 plugin (pfsense-dns-services): /v1/namespaces/kube-system/Service/traefik MODIFIED - 8071070
2022-02-23T22:40:28+00:00 plugin (pfsense-dns-services): /v1/namespaces/kube-system/Service/traefik MODIFIED - 8071071
2022-02-23T22:40:28+00:00 plugin (pfsense-dns-services): /v1/namespaces/kube-system/Service/traefik MODIFIED - 8071096
2022-02-23T22:40:28+00:00 plugin (pfsense-dns-haproxy-ingress-proxy): /networking.k8s.io/v1/namespaces/test/Ingress/mysite-ingress MODIFIED - 8070722
2022-02-23T22:40:28+00:00 plugin (pfsense-dns-haproxy-ingress-proxy): /networking.k8s.io/v1/namespaces/cattle-system/Ingress/rancher MODIFIED - 8070723
2022-02-23T22:40:28+00:00 plugin (pfsense-dns-haproxy-ingress-proxy): /networking.k8s.io/v1/namespaces/test/Ingress/mysite-ingress MODIFIED - 8070726
2022-02-23T22:40:31+00:00 plugin (haproxy-declarative): successfully reloaded HAProxy service
2022-02-23T22:40:31+00:00 plugin (pfsense-dns-services): /v1/namespaces/kube-system/Service/traefik MODIFIED - 8071097
2022-02-23T22:40:31+00:00 plugin (pfsense-dns-services): /v1/namespaces/kube-system/Service/traefik MODIFIED - 8071129
2022-02-23T22:40:31+00:00 plugin (pfsense-dns-services): /v1/namespaces/kube-system/Service/traefik MODIFIED - 8071130
2022-02-23T22:40:31+00:00 plugin (pfsense-dns-haproxy-ingress-proxy): /networking.k8s.io/v1/namespaces/cattle-system/Ingress/rancher MODIFIED - 8070728
2022-02-23T22:40:31+00:00 plugin (pfsense-dns-haproxy-ingress-proxy): /networking.k8s.io/v1/namespaces/cattle-system/Ingress/rancher MODIFIED - 8070731
2022-02-23T22:40:31+00:00 plugin (pfsense-dns-haproxy-ingress-proxy): /networking.k8s.io/v1/namespaces/cattle-system/Ingress/rancher MODIFIED - 8070755
2022-02-23T22:40:31+00:00 plugin (pfsense-dns-haproxy-ingress-proxy): /networking.k8s.io/v1/namespaces/test/Ingress/mysite-ingress MODIFIED - 8070756
2022-02-23T22:40:33+00:00 plugin (haproxy-declarative): successfully reloaded HAProxy service
2022-02-23T22:40:33+00:00 plugin (haproxy-ingress-proxy): successfully reloaded HAProxy service
2022-02-23T22:40:33+00:00 plugin (pfsense-dns-services): /v1/namespaces/kube-system/Service/traefik MODIFIED - 8071153
2022-02-23T22:40:33+00:00 plugin (pfsense-dns-services): /v1/namespaces/kube-system/Service/traefik MODIFIED - 8071155
2022-02-23T22:40:33+00:00 plugin (pfsense-dns-services): /v1/namespaces/kube-system/Service/traefik MODIFIED - 8071192
2022-02-23T22:40:33+00:00 plugin (pfsense-dns-haproxy-ingress-proxy): /networking.k8s.io/v1/namespaces/cattle-system/Ingress/rancher MODIFIED - 8070757
2022-02-23T22:40:33+00:00 plugin (pfsense-dns-haproxy-ingress-proxy): /networking.k8s.io/v1/namespaces/test/Ingress/mysite-ingress MODIFIED - 8070758
2022-02-23T22:40:33+00:00 plugin (pfsense-dns-haproxy-ingress-proxy): /networking.k8s.io/v1/namespaces/cattle-system/Ingress/rancher MODIFIED - 8070760
I have a "simple" setup currently with a rancher service and one test service. I'm running k3s v1.22.3+k3s1 with 3 servers and 3 agents in a HA config. For HA I'm using kube-vip and then using metallb for service load balancing. Finally traefik for ingress. Currently haproxy on pfsense doing certification management and SSL offloading. This issue seems to caused by k3s thinking there is a change then triggering this project to go update pfsense.
Let me know if you need more details about my setup.
Welcome! Something is constantly triggering updates on the Ingress and Service objects, which is pretty abnormal. I'm not familiar with kube-vip, but based on a quick look my guess is that kube-vip and MetalLB are 'fighting' each other over who 'owns' the LoadBalancer IP address. I would suggest running something like kubectl get svc -A | grep LoadBalancer under watch (or something like kubectl get svc -A --watch) and seeing if the IP is flapping constantly.
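For example, a minimal sketch of that check (assuming you have kubectl access to the cluster; the 2-second interval is arbitrary):
# re-run the listing every 2 seconds and watch the EXTERNAL-IP column
watch -n 2 "kubectl get svc -A | grep LoadBalancer"
# or stream changes to the Service objects as they happen
kubectl get svc -A --watch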
You are correct, I misread the documentation: https://kube-vip.chipzoller.dev/docs/installation/daemonset/ After removing MetalLB, pfSense is no longer being DDoSed. However, I might have configured something wrong: k3s is not picking up the VIP, and this project is using one of the control nodes as the entry point, which defeats the purpose of the VIP for redundancy. I might mess around with turning off the services option in kube-vip and trying MetalLB again on top of kube-vip.
Sounds good. Let me know how it goes. We’ll leave this open until we know everything is good.
Update: I played with this a lot more and removed kube-vip from the equation altogether. It looks like Traefik is causing MetalLB to hand out an IP constantly. I haven't figured out why yet.
{"caller":"level.go:63","event":"ipAllocated","ip":"172.16.2.30","level":"info","msg":"IP address assigned by controller","service":"kube-system/traefik","ts":"2022-02-25T16:05:23.060373128Z"}
{"caller":"level.go:63","event":"serviceUpdated","level":"info","msg":"updated service object","service":"kube-system/traefik","ts":"2022-02-25T16:05:23.066429678Z"}
{"caller":"level.go:63","event":"ipAllocated","ip":"172.16.2.30","level":"info","msg":"IP address assigned by controller","service":"kube-system/traefik","ts":"2022-02-25T16:05:23.06652935Z"}
{"caller":"level.go:63","error":"Operation cannot be fulfilled on services \"traefik\": the object has been modified; please apply your changes to the latest version and try again","level":"error","msg":"failed to update service status","op":"updateServiceStatus","service":"kube-system/traefik","ts":"2022-02-25T16:05:23.071830824Z"}
{"caller":"level.go:63","event":"ipAllocated","ip":"172.16.2.30","level":"info","msg":"IP address assigned by controller","service":"kube-system/traefik","ts":"2022-02-25T16:05:25.038477461Z"}
{"caller":"level.go:63","event":"serviceUpdated","level":"info","msg":"updated service object","service":"kube-system/traefik","ts":"2022-02-25T16:05:25.050305976Z"}
{"caller":"level.go:63","event":"ipAllocated","ip":"172.16.2.30","level":"info","msg":"IP address assigned by controller","service":"kube-system/traefik","ts":"2022-02-25T16:05:27.038608675Z"}
{"caller":"level.go:63","event":"serviceUpdated","level":"info","msg":"updated service object","service":"kube-system/traefik","ts":"2022-02-25T16:05:27.045857899Z"}
{"caller":"level.go:63","event":"ipAllocated","ip":"172.16.2.30","level":"info","msg":"IP address assigned by controller","service":"kube-system/traefik","ts":"2022-02-25T16:05:29.039585392Z"}
{"caller":"level.go:63","event":"serviceUpdated","level":"info","msg":"updated service object","service":"kube-system/traefik","ts":"2022-02-25T16:05:29.05151219Z"}
{"caller":"level.go:63","event":"ipAllocated","ip":"172.16.2.30","level":"info","msg":"IP address assigned by controller","service":"kube-system/traefik","ts":"2022-02-25T16:05:31.036852784Z"}
{"caller":"level.go:63","event":"serviceUpdated","level":"info","msg":"updated service object","service":"kube-system/traefik","ts":"2022-02-25T16:05:31.048244625Z"}
{"caller":"level.go:63","event":"ipAllocated","ip":"172.16.2.30","level":"info","msg":"IP address assigned by controller","service":"kube-system/traefik","ts":"2022-02-25T16:05:33.045624429Z"}
{"caller":"level.go:63","event":"serviceUpdated","level":"info","msg":"updated service object","service":"kube-system/traefik","ts":"2022-02-25T16:05:33.057657115Z"}
What about using kube-vip for service load balancing as well? I managed to get this working, but for whatever reason kube-vip is not updating k3s about the IP address it handed out, so I have to manually read the logs and update pfSense.
A kube-vip plugin likely would not be very difficult to add. If you get it up and going and working as expected, I'm happy to take a look.
Found this: https://rancher.com/docs/k3s/latest/en/networking/#disabling-the-service-lb
That was my issue; after I disabled servicelb, everything worked as expected. Now it's time to dig into kube-vip.
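For reference, a rough sketch of what disabling the bundled ServiceLB looks like, per the linked k3s docs (the install command and config snippet are illustrative; adjust to your environment):
# fresh install of a k3s server without the bundled ServiceLB (Klipper LB)
curl -sfL https://get.k3s.io | sh -s - server --disable servicelb
# or, for an existing install, add the equivalent to /etc/rancher/k3s/config.yaml and restart k3s:
disable:
  - servicelb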
Ah! So you had 3 things fighting over the ip…that’s a disaster for sure. Are you going to reinstall metallb now or stick with just kube-vip?
Thanks for the help. I did mess around with kube-vip more, but I cannot get it to play well, so I'm going to leave it at that and use MetalLB. One last question: I have not seen any configuration option to use two or more shared frontends. I have two shared frontends configured in HAProxy, one for the WAN side and one for the LAN side. This lets me do proper SSL without needing to expose all of the services to the public. Any suggestions for this without running two instances of this project?
Give me a bit more detail on the setup and desired outcome if you don’t mind and I’ll see if it’s possible.
Sure, I have two frontends bound to two interfaces, WAN and LAN. This helps with separating rules for internal and public services. Similar to this:
frontend shared-https-merged
bind xxx.xxx.xxx.131:443 name xxx.xxx.xxx.131:443 ssl crt-list /var/etc/haproxy/shared-https.crt_list
mode http
log global
option httpclose
timeout client 30000
rspidel ^Server:.*$
acl aclcrt_shared-https var(txn.txnhost) -m reg -i ^([^\.]*)\.example\.com(:([0-9]){1,5})?$
acl ACL1 var(txn.txnhost) -m str -i nextcloud.example.com
acl ACL11 var(txn.txnhost) -m str -i pass.example.com
acl ACL20 var(txn.txnhost) -m str -i ha.example.com
use_backend nextcloud.example.com_ipvANY if ACL1
use_backend pass.example.com_ipvANY if ACL11
use_backend ha.example.com_ipvANY if ACL20
use_backend bad_backend_ipvANY if aclcrt_shared-https
frontend shared-https-local-merged
bind 172.16.1.1:443 name 172.16.1.1:443 ssl crt-list /var/etc/haproxy/shared-https-local.crt_list
mode http
log global
option httpclose
option forwardfor
acl https ssl_fc
http-request set-header X-Forwarded-Proto http if !https
http-request set-header X-Forwarded-Proto https if https
timeout client 30000
rspidel ^Server:.*$
acl aclcrt_shared-https-local var(txn.txnhost) -m reg -i ^([^\.]*)\.example\.com(:([0-9]){1,5})?$
acl ACL4 var(txn.txnhost) -m str -i home.example.com
acl ACL5 var(txn.txnhost) -m str -i unifi.example.com
acl ACL2 var(txn.txnhost) -m beg -i nextcloud.example.com
acl ACL14 var(txn.txnhost) -m str -i bi.example.com
acl ACL15 var(txn.txnhost) -m str -i primary.example.com
acl ACL16 var(txn.txnhost) -m str -i secondary.example.com
acl ACL17 var(txn.txnhost) -m str -i plex.example.com
acl ACL17 var(txn.txnhost) -m str -i plex.direct
acl ACL18 var(txn.txnhost) -m str -i syno.example.com
acl ACL19 var(txn.txnhost) -m str -i ha.example.com
acl ACLGRAFANA var(txn.txnhost) -m str -i grafana.example.com
acl ACLESPHOME var(txn.txnhost) -m str -i esphome.example.com
acl PASS_LOCAL_ACL var(txn.txnhost) -m str -i pass.example.com
use_backend home.example.com_ipvANY if ACL4
use_backend unifi.example.com_ipvANY if ACL5
use_backend nextcloud.example.com_ipvANY if ACL2
use_backend bi.example.com_ipvANY if ACL14
use_backend primary.example.com_ipvANY if ACL15
use_backend secondary.example.com_ipvANY if ACL16
use_backend plex.example.com_ipvANY if ACL17
use_backend synology.example.com_ipvANY if ACL18
use_backend ha.example.com_ipvANY if ACL19
use_backend grafana.example.com_ipvANY if ACLGRAFANA
use_backend esphome.example.com_ipvANY if ACLESPHOME
use_backend pass.example.com_ipvANY if PASS_LOCAL_ACL
use_backend bad_backend_ipvANY if aclcrt_shared-https-local
I do not want everything to be exposed to the public, especially projects I'm currently working on. However, it would be nice to have them work with proper SSL certificates. So I just need that particular project to use the shared-https-local frontend.
I'm guessing you're referring to the haproxy-ingress-proxy feature. If so, this bit from the README is probably what you're after:
Optionally, on the ingress resources you can set the following annotations: haproxy-ingress-proxy.pfsense.org/frontend and haproxy-ingress-proxy.pfsense.org/backend to respectively set the frontend and backend to override the defaults.
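As an illustration, a sketch of how that annotation could look on an Ingress (the hostname, backing service name, and exact pfSense frontend name are assumptions for the example; use the names from your own setup):
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mysite-ingress
  namespace: test
  annotations:
    # route this ingress to the LAN-side frontend instead of the configured default
    haproxy-ingress-proxy.pfsense.org/frontend: shared-https-local
spec:
  rules:
    - host: mysite.example.com      # example hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: mysite        # hypothetical backing service
                port:
                  number: 80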
That's not going to solve the issue; it replaces the default. As you can see, my HAProxy has duplicate entries for local and public. Why? This allows me to apply different rules for local vs. public traffic. On top of that, my public DNS points to Cloudflare, and they have limits on bandwidth, single-file size, etc. Having local traffic go directly to the proxy entry point bypasses all of that.
You want the same ingress to be added to 2 frontends?
Yes
I have this implemented. Anything else needed?
How do you go about doing it? Setting a default and setting an annotation (haproxy-ingress-proxy.pfsense.org/frontend) at the same time?
I haven't committed it yet, but it just supports a comma-separated list instead of a single entry.
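Presumably that ends up looking something like this in the annotation (frontend names borrowed from the config above; the exact format is an assumption until the feature is released):
haproxy-ingress-proxy.pfsense.org/frontend: shared-https,shared-https-local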
Thank you so much. Let me know if you want me to test it.
Released in v0.5.12.
Any luck testing this out?
I just tested it, works as expected. Thank you so much!
Hey, just peeping this convo. I was wondering: were you able to get this to work with kube-vip without MetalLB? Should it just work with this controller, or is some additional integration required?
Let’s open another issue for kube-vip support. Currently I think the only real dependency on metallb is that it checks for the configmap (which is arbitrary; it doesn’t use any data from it). With some minor adjustments it should work fine with kube-vip as well.
v0.5.13 has removed any need for metallb at all. The plugin is still named metallb (for now), but it simply manages BGP peers by pushing cluster nodes to pfSense.