ovn-kubernetes
ovn-kubernetes copied to clipboard
make svc_template_support configurable, default is enabled
- What this PR does and why is it needed
- Special notes for reviewers
- How to verify it
- Description for the changelog
Deploy Preview for subtle-torrone-bb0c84 ready!
Name | Link |
---|---|
Latest commit | da4a3741d8027b04c9b9a6624fd422004f2078d4 |
Latest deploy log | https://app.netlify.com/sites/subtle-torrone-bb0c84/deploys/6622c4dc25c244000883e562 |
Deploy Preview | https://deploy-preview-4293--subtle-torrone-bb0c84.netlify.app |
Preview on mobile | Toggle QR Code...Use your smartphone camera to open QR code link. |
To edit notification comments on pull requests, go to your Netlify site configuration.
Is there a reason why we don't want to use LB Templates?
Changes unknown when pulling da4a3741d8027b04c9b9a6624fd422004f2078d4 on cathy-zhou:svc_template_configurable into ** on ovn-org:master**.
Is there a reason why we don't want to use LB Templates?
There are several issues we found downstream. @cathy-zhou can you please raise issues for:
- Performance regression while using service templates.
- If a K8s nodes have different set of IP addresses, then the template doesn't work as expected. The ovn-controller on nodes with lesser number of IPs will interpret the template variable as a 'raw string' and spews out lot of errors.
Is there a reason why we don't want to use LB Templates?
There are several issues we found downstream. @cathy-zhou can you please raise issues for:
- Performance regression while using service templates.
- If a K8s nodes have different set of IP addresses, then the template doesn't work as expected. The ovn-controller on nodes with lesser number of IPs will interpret the template variable as a 'raw string' and spews out lot of errors.
For 2, I just filed https://github.com/ovn-org/ovn-kubernetes/issues/4301. We are still investigating the performance issue and will file one once we know more.
@cathy-zhou can you please rebase?
- Performance regression while using service templates. @hzhou8 do you have corresponding issue filed for the offloading performance issue when svc_template is enabled? if so, can you post the link to it?
Please also add this to https://github.com/ovn-org/ovn-kubernetes/blob/master/helm/ovn-kubernetes/values.yaml so helm can benefit from it. @flavio-fernandes please take a look
- Performance regression while using service templates. @hzhou8 do you have corresponding issue filed for the offloading performance issue when svc_template is enabled? if so, can you post the link to it?
I don't have an issue reported yet, but here are the details:
When LB template is enabled, a priority 120 flow is missing in the UNSNAT stage that skips the UNSNAT for LB VIPs that are also used as SNAT IP. The flow was added by: 832893bdbb42 ("ovn-northd: Skip unsnat flows for load balancer vips in router ingress pipeline")
The LB template implementation in OVN didn't handle this properly, and the flow is missing if template is used. (It is actually difficult to handle with template, so this can be a limitation of LB template) - Because of this missing flow, traffic to node-port is not offloaded, and performance drops significantly. But this is only for our downstream environment. The full story is more complex:
The above commit 832893bdbb42 itself has a problem, and this priority-120 flow is again removed by a bug fix: ffe267317c25 ("northd: Don't skip the unSNAT stage for traffic towards VIPs.") This bug fix is also backported to 23.06 but not to our downstream branch yet.
So, enabling LB template resulted in HW offload problem in our downstream where the above bug fix doesn't exist, and disabling LB template restores the HW offload. However, for current upstream branches with the above fix ffe267317c25, it is very likely that the HW offload problem is present regardless of the LB template. This is yet to be verified. If it is verified, then we will need to provide a fix in OVN, and we are not sure yet if we can make LB template work with HW offload, which depends on the new fix in OVN.
Please also add this to https://github.com/ovn-org/ovn-kubernetes/blob/master/helm/ovn-kubernetes/values.yaml so helm can benefit from it. @flavio-fernandes please take a look
awesome. TYSM!
LGTM