
How do I use routes?

vbezhenar opened this issue 2 years ago • 4 comments

I've installed cloud-controller-manager and I'm researching the routes feature. I didn't find any mention of route configuration in the docs, but in the sources there's routes.go with some functionality. I hoped it would allow the CNI to push routes to the OpenStack router and avoid using network encapsulation like VXLAN.

I'm using the following snippet in cloud.conf:

    [Global]
    auth-url="https://auth.pscloud.io/v3/"
    application-credential-id="$OS_APPLICATION_CREDENTIAL_ID"
    application-credential-secret="$OS_APPLICATION_CREDENTIAL_SECRET"
    region="kz-ala-1"
    
    [Metadata]
    search-order="metadataService"

    [Route]
    router-id=27fdfeba-1f16-41bd-9610-6ca7b74d1c2b

The router ID is the identifier, taken from the OpenStack UI, of the router that's connected to the network Kubernetes is running in.
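
(For reference, the router ID can also be looked up with the OpenStack CLI instead of the UI, assuming the client is configured against the same project:)

    openstack router list
    openstack router show 27fdfeba-1f16-41bd-9610-6ca7b74d1c2b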

With this configuration the controllers crash and I get the following scary but not very informative log:

I0712 15:15:32.553250       1 controllermanager.go:272] Starting "service"
I0712 15:15:32.558110       1 openstack.go:336] Claiming to support LoadBalancer
I0712 15:15:32.558370       1 controllermanager.go:291] Started "service"
I0712 15:15:32.558446       1 controller.go:233] Starting service controller
I0712 15:15:32.558461       1 shared_informer.go:255] Waiting for caches to sync for service
I0712 15:15:32.558558       1 controllermanager.go:272] Starting "route"
I0712 15:15:32.868259       1 openstack.go:449] Claiming to support Routes
E0712 15:15:32.868293       1 controllermanager.go:275] Error starting "route"
F0712 15:15:32.868298       1 controllermanager.go:180] error running controllers: failed to parse cidr value:"" with error:invalid CIDR address: 
goroutine 225 [running]:
k8s.io/klog/v2.stacks(0x1)
        /home/runner/work/cloud-provider-openstack/cloud-provider-openstack/.go/pkg/mod/k8s.io/klog/[email protected]/klog.go:860 +0x8a
k8s.io/klog/v2.(*loggingT).output(0x306f3a0, 0x3, 0x0, 0xc0000f8690, 0x1, {0x257807f, 0x1}, 0x306ff00, 0x0)
        /home/runner/work/cloud-provider-openstack/cloud-provider-openstack/.go/pkg/mod/k8s.io/klog/[email protected]/klog.go:825 +0x686
k8s.io/klog/v2.(*loggingT).printfDepth(0x306f3a0, 0x2041a28, 0x0, {0x0, 0x0}, 0xc000487920, {0x1da6686, 0x1d}, {0xc000726820, 0x1, ...})
        /home/runner/work/cloud-provider-openstack/cloud-provider-openstack/.go/pkg/mod/k8s.io/klog/[email protected]/klog.go:630 +0x1f2
k8s.io/klog/v2.(*loggingT).printf(...)
        /home/runner/work/cloud-provider-openstack/cloud-provider-openstack/.go/pkg/mod/k8s.io/klog/[email protected]/klog.go:612
k8s.io/klog/v2.Fatalf(...)
        /home/runner/work/cloud-provider-openstack/cloud-provider-openstack/.go/pkg/mod/k8s.io/klog/[email protected]/klog.go:1516
k8s.io/cloud-provider/app.Run.func1({0x20593d0, 0xc0007b6e00}, 0xc000113400)
        /home/runner/work/cloud-provider-openstack/cloud-provider-openstack/.go/pkg/mod/k8s.io/[email protected]/app/controllermanager.go:180 +0x345
k8s.io/cloud-provider/app.Run.func2({0x20593d0, 0xc0007b6e00})
        /home/runner/work/cloud-provider-openstack/cloud-provider-openstack/.go/pkg/mod/k8s.io/[email protected]/app/controllermanager.go:225 +0xe4
created by k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run
        /home/runner/work/cloud-provider-openstack/cloud-provider-openstack/.go/pkg/mod/k8s.io/[email protected]/tools/leaderelection/leaderelection.go:211 +0x154

goroutine 1 [select (no cases)]:

and then a huge stack trace follows.
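
For what it's worth, the "invalid CIDR address: " text matches what Go's net.ParseCIDR returns for an empty string, so my guess is the controller is parsing a cluster CIDR that was never configured. A minimal sketch reproducing the message:

    package main

    import (
        "fmt"
        "net"
    )

    func main() {
        // An unset cluster CIDR presumably reaches the parser as an empty string.
        _, _, err := net.ParseCIDR("")
        fmt.Println(err) // prints: invalid CIDR address:
    }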

I tried to google this feature, but it's not mentioned anywhere, except in the logs: if I remove the router-id configuration, there's a very short message that router-id is missing (which is why I started researching this at all).

Without router-id everything works just fine.

vbezhenar avatar Jul 12 '22 15:07 vbezhenar

I hoped it would allow the CNI to push routes to the OpenStack router and avoid using network encapsulation like VXLAN.

I don't fully understand the use case. The router is an L3 concept and likely runs over VXLAN/Geneve itself, so if you want to avoid that kind of encapsulation, I think a provider network (an L2 concept) is what you should use? In that case router-id is not something you should set.

Anyway, I need a clearer view of the goal you want to achieve. Thanks.
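
For illustration, a provider network is something the cloud admin would create; a rough sketch (the network type, physical network name, and segment ID here are deployment-specific placeholders):

    openstack network create --provider-network-type vlan \
        --provider-physical-network physnet1 \
        --provider-segment 100 provider-net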

jichenjc avatar Jul 13 '22 01:07 jichenjc

Right now I'm using Calico in encapsulation mode. As I understand it, all IP packets from the pod network are encapsulated using something called VXLAN, and those encapsulated packets are sent to the target node, then unwrapped and delivered to the target pod. This encapsulation introduces overhead and isolates the pod network from the node network (so I can't ping my pods from a node, for example).

Calico by default does not do encapsulation and, as I understand it, uses BGP to announce routes for the pod network to the router. The router is supposed to receive those routes and use them to route packets between pods. That does not work for me, at least with my provider. I thought that if I could announce those routes via the OpenStack router API instead of BGP, it would work, and I expected this routes.go code to do exactly that.
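
(For context, the BGP peering I tried is configured through Calico's BGPPeer resource; a sketch with placeholder peer address and AS number:)

    apiVersion: projectcalico.org/v3
    kind: BGPPeer
    metadata:
      name: openstack-router
    spec:
      peerIP: 10.6.0.1   # placeholder: the router's address on the node network
      asNumber: 64512    # placeholder: the router's AS number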

I don't manage the OpenStack installation; I'm more like a client of it, so I can't configure the server side.

If I'm wrong, then I guess this isn't an issue for me. But there's still code in the repo, and there's a warning in the logs about the unconfigured option (when it's not configured), so it would be nice to have some documentation for this feature.

vbezhenar avatar Jul 13 '22 20:07 vbezhenar

I didn't write the route code, but to solve your case, kubectl edit daemonset openstack-cloud-controller-manager -n kube-system and add this line:

      containers:
      - args:
        - /bin/openstack-cloud-controller-manager
        - --v=1
        - --cluster-name=$(CLUSTER_NAME)
        - --cloud-config=$(CLOUD_CONFIG)
        - --cloud-provider=openstack
        - --use-service-account-credentials=true
        - --bind-address=127.0.0.1
        - --cluster-cidr=10.0.7.0/20   # depends on your settings; add this line with your desired CIDR
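
Depending on your cluster, the route controller may also need node pod CIDRs to be allocated before it can compute routes; a hedged addition (defaults can vary by version):

        - --allocate-node-cidrs=true      # nodes need spec.podCIDR for routes to be created
        - --configure-cloud-routes=true   # usually the default; shown for completeness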

Then you can avoid the error you saw, and you will see something like the following in the log. I need more insight on the routes themselves, since I don't have much expertise on them for now:

I0716 08:33:03.745333       1 route_controller.go:194] Creating route for node capi-quickstart-md-0-5lnfl 192.168.1.0/24 with hint 6d82fd41-5fea-4a12-b4aa-32c9a2b8977e, throttled 16.589µs
I0716 08:33:03.745343       1 route_controller.go:194] Creating route for node capi-quickstart-control-plane-ffmnt 192.168.0.0/24 with hint 47910415-4bf0-461c-8c19-e26d0c057eca, throttled 11.087µs
I0716 08:33:05.638241       1 route_controller.go:214] Created route for node capi-quickstart-md-0-5lnfl 192.168.1.0/24 with hint 6d82fd41-5fea-4a12-b4aa-32c9a2b8977e after 1.892902135s
I0716 08:33:07.051831       1 route_controller.go:214] Created route for node capi-quickstart-control-plane-ffmnt 192.168.0.0/24 with hint 47910415-4bf0-461c-8c19-e26d0c057eca after 3.306492795s
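
(To cross-check, the per-node pod CIDRs the controller programs can be listed with kubectl:)

    kubectl get nodes -o custom-columns=NAME:.metadata.name,PODCIDR:.spec.podCIDR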

jichenjc avatar Jul 16 '22 08:07 jichenjc

This is for my own reference; maybe some expert can comment to help as well.

The route controller seems to create routes on the given router, which means pods inside the tenant can communicate through that router (each node's pod CIDR is routed to that node's address on the tenant network). I doubt whether it's needed if a CNI like Calico is used; this needs more digging.

openstack router show 5ea61b43-a36a-4bd9-922f-5cd1765dc0f7
....
| routes | destination='192.168.0.0/24', gateway='10.6.0.142' |
|        | destination='192.168.1.0/24', gateway='10.6.0.184' |

jichenjc avatar Jul 18 '22 00:07 jichenjc

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Oct 16 '22 01:10 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Nov 15 '22 01:11 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Dec 15 '22 01:12 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Dec 15 '22 01:12 k8s-ci-robot