OVN load balancer backend health checks
Issue description
When creating a load balancer, Incus doesn't offer any options to configure health checks, although OVN has the capability to do so.
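Purely to illustrate the request, a user-facing option could look something like the sketch below; these healthcheck.* keys are hypothetical and do not exist in Incus today:
# Hypothetical syntax, not current Incus options
incus network load-balancer set my-ovn-network 203.0.113.10 \
    healthcheck=true \
    healthcheck.interval=5 healthcheck.timeout=20 \
    healthcheck.success_count=3 healthcheck.failure_count=3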
Steps to create a health check manually
# Make sure the OVN NB and SB databases can be reached; with a cluster, use the TCP endpoints rather than the local sockets
export OVN_NB_DB="tcp:[REDACTED].196:6641,tcp:[REDACTED].4:6641,tcp:[REDACTED].68:6641"
export OVN_SB_DB="tcp:[REDACTED].196:6642,tcp:[REDACTED].4:6642,tcp:[REDACTED].68:6642"
# Find your load balancer and note its [LBID], [LBVIP] and [LBPORT]
ovn-nbctl list Load_Balancer
# Create the health check, attach it to the LB and note the returned [LBHCID]
ovn-nbctl --id=@lbhc create Load_Balancer_Health_Check vip='"[LBVIP]:[LBPORT]"' -- set Load_Balancer [LBID] health_check=@lbhc
# Configure the health check parameters
ovn-nbctl set Load_Balancer_Health_Check [LBHCID] options:interval=5 options:timeout=20 options:success_count=3 options:failure_count=3
# Find your logical switch (you can match one of your instances' MAC addresses against the port MACs) and take note of the port names
ovn-nbctl show
# With 3 backend instances, create one mapping per backend (set the first entry, then add the others)
ovn-nbctl --wait=sb set load_balancer [LBID] ip_port_mappings:[INSTANCE_IP]=[INSTANCE_PORT_ABOVE]:[SOURCE_IP_USED_TO_CHECK_THE_INSTANCE]
ovn-nbctl --wait=sb add load_balancer [LBID] ip_port_mappings [INSTANCE_IP]='"[INSTANCE_PORT_ABOVE]:[SOURCE_IP_USED_TO_CHECK_THE_INSTANCE]"'
ovn-nbctl --wait=sb add load_balancer [LBID] ip_port_mappings [INSTANCE_IP]='"[INSTANCE_PORT_ABOVE]:[SOURCE_IP_USED_TO_CHECK_THE_INSTANCE]"'
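To make the sequence above more concrete, here is a hedged end-to-end sketch with made-up values (the LB name, VIP, backend addresses, LSP names and the 10.0.0.254 probe source are placeholders, not taken from a real deployment):
# All names and addresses below are placeholders for an LB with VIP 192.0.2.10:80 and three backends
LB_ID=$(ovn-nbctl --bare --columns=_uuid find Load_Balancer name=incus-net7-lb-192.0.2.10-tcp)
# create prints the UUID of the new health check record
HC_ID=$(ovn-nbctl --id=@lbhc create Load_Balancer_Health_Check vip='"192.0.2.10:80"' \
    -- set Load_Balancer "$LB_ID" health_check=@lbhc)
ovn-nbctl set Load_Balancer_Health_Check "$HC_ID" \
    options:interval=5 options:timeout=20 options:success_count=3 options:failure_count=3
# One ip_port_mappings entry per backend: backend IP = LSP name : probe source IP
ovn-nbctl --wait=sb set load_balancer "$LB_ID" \
    ip_port_mappings:10.0.0.2=incus-net7-instance-aaaa-eth0:10.0.0.254
ovn-nbctl --wait=sb add load_balancer "$LB_ID" \
    ip_port_mappings 10.0.0.3='"incus-net7-instance-bbbb-eth0:10.0.0.254"'
ovn-nbctl --wait=sb add load_balancer "$LB_ID" \
    ip_port_mappings 10.0.0.4='"incus-net7-instance-cccc-eth0:10.0.0.254"'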
This will create the service monitors:
# ovn-sbctl list service_monitor
_uuid : 65dcfe21-4a0e-4812-aed7-61c1d463744e
external_ids : {}
ip : "[REDACTED].228"
logical_port : incus-net7-instance-a6699d89-8c0d-477e-8056-fcd8bd9a2e3f-eth0
options : {failure_count="3", interval="5", success_count="3", timeout="20"}
port : 1234
protocol : tcp
src_ip : "[REDACTED].230"
src_mac : "02:4f:77:db:48:ff"
status : online
_uuid : 9b3b0b6a-cc59-45c1-8919-ff3ec7cd97fb
external_ids : {}
ip : "[REDACTED].226"
logical_port : incus-net7-instance-b84d37b2-038e-4c42-a326-c432b7e33ba6-eth0
options : {failure_count="3", interval="5", success_count="3", timeout="20"}
port : 1234
protocol : tcp
src_ip : "[REDACTED].230"
src_mac : "02:4f:77:db:48:ff"
status : online
_uuid : e3827923-5d26-4128-b3bc-7bf6f07c1d27
external_ids : {}
ip : "[REDACTED].227"
logical_port : incus-net7-instance-797e88cb-2235-47ac-8b12-264355f53f43-eth0
options : {failure_count="3", interval="5", success_count="3", timeout="20"}
port : 1234
protocol : tcp
src_ip : "[REDACTED].230"
src_mac : "02:4f:77:db:48:ff"
status : online
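For a quicker view of just the monitor state, the usual ovsdb-style output options should also work here, e.g.:
# Show only the columns of interest
ovn-sbctl --columns=logical_port,ip,port,status list service_monitor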
Got it working locally here over IPv4. I'll have to check IPv6 next.
One thing to note: picking the right source address for the check is important. I first tried using the gateway address on the LSP for the check; this didn't work at all and broke connectivity for the instances. Using a completely random IPv4 address also didn't work properly, and neither did using an address on the uplink subnet.
Seems like the best bet is to use an otherwise unused address on the virtual network's subnet, which isn't ideal as we don't really have one of those we can just use... I'll have to look at the OVN documentation to see what can be done there. We may need to forbid the use of the last address of the network's subnet and then use that address for the checks.
It works if I select an IP from the ipv4.routes range set on the UPLINK (--type=physical) network. My UPLINK config:
config:
  bgp.peers.router01.address: [REDACTED].9.2
  bgp.peers.router01.asn: "64661"
  bgp.peers.router02.address: [REDACTED].9.3
  bgp.peers.router02.asn: "64661"
  dns.nameservers: [REDACTED].80.77,[REDACTED].80.136,[REDACTED].80.202
  ipv4.gateway: [REDACTED].9.1/24
  ipv4.ovn.ranges: [REDACTED].9.50-[REDACTED].9.254
  ipv4.routes: [REDACTED].12.0/22
  mtu: "9000"
  ovn.ingress_mode: routed
  volatile.last_state.created: "false"
My OVN networks at the moment are:
[REDACTED].12.193/27
[REDACTED].12.1/25
[REDACTED].12.233/29
[REDACTED].12.129/26
[REDACTED].12.225/29
And my LBs, all working:
[REDACTED].15.202
[REDACTED].15.203
[REDACTED].15.206
Hmm, so in my setup, I now have:
root@ovn:~# incus network show UPLINK
config:
  ipv4.address: 10.171.5.1/24
  ipv4.dhcp.ranges: 10.171.5.2-10.171.5.99
  ipv4.nat: "true"
  ipv4.ovn.ranges: 10.171.5.100-10.171.5.254
  ipv4.routes: 169.254.169.254/32
  ipv6.address: fd42:bd75:85b2:5e3c::1/64
  ipv6.nat: "true"
description: ""
name: UPLINK
type: bridge
used_by:
- /1.0/networks/default
managed: true
status: Created
locations:
- none
project: default
And in OVN:
root@ovn:~# ovn-nbctl list Load_balancer
_uuid : b7a2486c-47cd-48b2-aa87-8e5de2d3012a
external_ids : {}
health_check : [32777ff5-ecd6-424b-87a3-62b2531ff644]
ip_port_mappings : {"10.135.209.2"="incus-net49-instance-22be9150-ce1e-4bac-901c-e9f8319c38a8-eth0:169.254.169.254", "10.135.209.3"="incus-net49-instance-e81541fd-a700-4b03-a14c-701774844857-eth0:169.254.169.254", "10.135.209.4"="incus-net49-instance-b9f27759-e9a2-49ac-aa28-e88948855f73-eth0:169.254.169.254"}
name : incus-net49-lb-10.171.5.10-tcp
options : {}
protocol : tcp
selection_fields : []
vips : {"10.171.5.10:22"="10.135.209.3:22,10.135.209.4:22,10.135.209.2:22"}
root@ovn:~# ovn-nbctl list Load_balancer_Health_Check
_uuid : 32777ff5-ecd6-424b-87a3-62b2531ff644
external_ids : {}
options : {failure_count="3", interval="5", success_count="3", timeout="20"}
vip : "10.171.5.10:22"
root@ovn:~#
This configuration doesn't actually work; in the guest I see:
22:37:49.309715 IP 169.254.169.254.50139 > 10.135.209.2.22: Flags [S], seq 2023238226, win 65160, length 0
22:37:49.309756 IP 10.135.209.2.22 > 169.254.169.254.50139: Flags [S.], seq 61768099, ack 2023238227, win 64240, options [mss 1460], length 0
22:37:50.327277 IP 10.135.209.2.22 > 169.254.169.254.50139: Flags [S.], seq 61768099, ack 2023238227, win 64240, options [mss 1460], length 0
22:37:52.343271 IP 10.135.209.2.22 > 169.254.169.254.50139: Flags [S.], seq 61768099, ack 2023238227, win 64240, options [mss 1460], length 0
22:37:55.799272 IP 10.135.209.2.22 > 169.254.169.254.17266: Flags [S.], seq 3609160431, ack 924126814, win 64240, options [mss 1460], length 0
22:37:56.567277 IP 10.135.209.2.22 > 169.254.169.254.50139: Flags [S.], seq 61768099, ack 2023238227, win 64240, options [mss 1460], length 0
So it's getting the check, but the response doesn't make it back to OVN, despite it being configured to receive that 169.254.169.254 address.
If I change it to:
root@ovn:~# ovn-nbctl list Load_balancer
_uuid : b7a2486c-47cd-48b2-aa87-8e5de2d3012a
external_ids : {}
health_check : [32777ff5-ecd6-424b-87a3-62b2531ff644]
ip_port_mappings : {"10.135.209.2"="incus-net49-instance-22be9150-ce1e-4bac-901c-e9f8319c38a8-eth0:10.135.209.254", "10.135.209.3"="incus-net49-instance-e81541fd-a700-4b03-a14c-701774844857-eth0:10.135.209.254", "10.135.209.4"="incus-net49-instance-b9f27759-e9a2-49ac-aa28-e88948855f73-eth0:10.135.209.254"}
name : incus-net49-lb-10.171.5.10-tcp
options : {}
protocol : tcp
selection_fields : []
vips : {"10.171.5.10:22"="10.135.209.3:22,10.135.209.4:22,10.135.209.2:22"}
root@ovn:~# ovn-nbctl list Load_balancer_Health_Check
_uuid : 32777ff5-ecd6-424b-87a3-62b2531ff644
external_ids : {}
options : {failure_count="3", interval="5", success_count="3", timeout="20"}
vip : "10.171.5.10:22"
root@ovn:~#
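For reference, a change like the one above can be applied with the same ip_port_mappings syntax used earlier; a sketch for one backend, reusing the UUID and LSP name from the listing:
# Re-point the probe source at an address on the OVN subnet (repeat for each backend)
ovn-nbctl --wait=sb set load_balancer b7a2486c-47cd-48b2-aa87-8e5de2d3012a \
    ip_port_mappings:10.135.209.2=incus-net49-instance-22be9150-ce1e-4bac-901c-e9f8319c38a8-eth0:10.135.209.254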
Then things behave again and I get this check:
root@u1:~# tcpdump -ni eth0
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
22:40:54.659073 IP 10.135.209.254.63228 > 10.135.209.2.22: Flags [S], seq 2868151362, win 65160, length 0
22:40:54.659117 IP 10.135.209.2.22 > 10.135.209.254.63228: Flags [S.], seq 2903410373, ack 2868151363, win 64240, options [mss 1460], length 0
22:40:54.659435 IP 10.135.209.254.63228 > 10.135.209.2.22: Flags [R], seq 2868151363, win 65160, length 0
^C
3 packets captured
3 packets received by filter
0 packets dropped by kernel
And the health check state looks good (the RST right after the SYN-ACK is expected; OVN's TCP monitor only probes the handshake and then resets the connection):
root@ovn:~# ovn-sbctl list service_monitor
_uuid : fc2efc35-9ddf-4c36-be1b-4a2ab9a48978
chassis_name : "cf6e0a79-3dd0-4bac-8a99-718aaf76ff40"
external_ids : {}
ip : "10.135.209.3"
logical_port : incus-net49-instance-e81541fd-a700-4b03-a14c-701774844857-eth0
options : {failure_count="3", interval="5", success_count="3", timeout="20"}
port : 22
protocol : tcp
src_ip : "10.135.209.254"
src_mac : "02:75:b9:2e:ad:f4"
status : online
_uuid : 68af335a-358b-431a-ad7f-6826b5da7564
chassis_name : "cf6e0a79-3dd0-4bac-8a99-718aaf76ff40"
external_ids : {}
ip : "10.135.209.2"
logical_port : incus-net49-instance-22be9150-ce1e-4bac-901c-e9f8319c38a8-eth0
options : {failure_count="3", interval="5", success_count="3", timeout="20"}
port : 22
protocol : tcp
src_ip : "10.135.209.254"
src_mac : "02:75:b9:2e:ad:f4"
status : online
_uuid : ed45c6ff-2ad5-4cd3-9f33-94fae017239d
chassis_name : "cf6e0a79-3dd0-4bac-8a99-718aaf76ff40"
external_ids : {}
ip : "10.135.209.4"
logical_port : incus-net49-instance-b9f27759-e9a2-49ac-aa28-e88948855f73-eth0
options : {failure_count="3", interval="5", success_count="3", timeout="20"}
port : 22
protocol : tcp
src_ip : "10.135.209.254"
src_mac : "02:75:b9:2e:ad:f4"
status : offline
root@ovn:~#
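As a quick sanity check of the online/offline transitions, one can stop the backend service in one instance and watch the corresponding monitor flip; a sketch, assuming the backends run sshd as in the example above:
# Inside one backend instance (service name depends on the guest OS)
systemctl stop ssh
# Back on the OVN side, after roughly failure_count * interval seconds the monitor should report offline
ovn-sbctl --columns=ip,status list service_monitor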
So I think I'll go with the easy option of reserving the last address of the subnet for internal use and then use that for the checks.
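For illustration, the "last address" here is the highest usable host address of the subnet (broadcast minus one); a minimal bash sketch, using the 10.135.209.0/24 subnet from the example above:
# Compute the last usable host address (broadcast - 1) of an IPv4 subnet
cidr=10.135.209.0/24
ip=${cidr%/*}; prefix=${cidr#*/}
IFS=. read -r o1 o2 o3 o4 <<< "$ip"
addr=$(( (o1 << 24) | (o2 << 16) | (o3 << 8) | o4 ))
mask=$(( (0xffffffff << (32 - prefix)) & 0xffffffff ))
last=$(( ((addr & mask) | (~mask & 0xffffffff)) - 1 ))   # broadcast - 1
printf '%d.%d.%d.%d\n' $((last >> 24 & 255)) $((last >> 16 & 255)) $((last >> 8 & 255)) $((last & 255))
# -> 10.135.209.254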
On IPv6, I'll likely just use the EUI64 of the OVN logical router's MAC.
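A minimal sketch of the EUI-64 derivation on the IPv6 side (flip the universal/local bit of the first octet and insert ff:fe in the middle), using the src_mac from the service_monitor output above purely as an example MAC:
# EUI-64 interface identifier from a MAC address
mac=02:75:b9:2e:ad:f4
IFS=: read -r a b c d e f <<< "$mac"
a=$(printf '%02x' $(( 0x$a ^ 0x02 )))   # flip the universal/local bit
printf 'EUI-64 IID: %s%s:%sff:fe%s:%s%s\n' "$a" "$b" "$c" "$d" "$e" "$f"
# -> 0075:b9ff:fe2e:adf4, appended to the network's /64 prefix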
Got it working on IPv6 too.
Okay, so I'll look at adding logic to ban the use of the last IPv4 address in the range, which should give us a reliable source address for these checks.
Then after that I'll look at adding the logic needed to put checks in place.
Stéphane, I mixed up stuff in my reply, my bad.
The ?.?.15.? IPs are the load balancer IPs, not the source health check IPs. For the source health check IP, I picked a random unused IP from the OVN network, which today could cause issues: if you bring up a new container/VM, that IP can be allocated and will lead to unpredictable results.
The idea of reserving the last address will fix this.
All good then; I'll proceed with my plan to use the EUI64 for IPv6 (computed from the LR MAC) and the last IPv4 address on the IPv4 side, tweaking our allocation logic so we never allocate that last IPv4 address to an instance.