
Error using MetalLB in L2 mode

Open DamiaPoquet opened this issue 5 years ago • 2 comments

I've been trying to deploy a Kubernetes cluster with the Contiv-VPP CNI and MetalLB, but I ran into the following error in the MetalLB speakers while trying to get it working:

```
{"caller":"main.go:301","error":"failed to unmarshal interface data for '/dump/vpp/v2/interfaces/ethernet', error unknown field \"rxModes\" in vpp_interfaces.Interface","msg":"failed to propagate node info to protocol handler","op":"setNode","protocol":"layer2","ts":"2019-09-01T16:25:53.036102033Z"}
```

Virtual IP addresses are being assigned to the LoadBalancer services, but no traffic gets through. I also tried arping the assigned VIP, with no response.

Additional data:

```yaml
contiv.conf: |-
  nodeToNodeTransport: vxlan
  useSRv6ForServices: false
  useDX6ForSrv6NodetoNodeTransport: false
  useTAPInterfaces: true
  tapInterfaceVersion: 2
  tapv2RxRingSize: 256
  tapv2TxRingSize: 256
  enableGSO: true
  tcpChecksumOffloadDisabled: true
  STNVersion: 2
  natExternalTraffic: true
  mtuSize: 1450
  scanIPNeighbors: true
  ipNeighborScanInterval: 1
  ipNeighborStaleThreshold: 4
  enablePacketTrace: false
  routeServiceCIDRToVPP: false
  crdNodeConfigurationDisabled: true
  ipamConfig:
    nodeInterconnectDHCP: true
    podSubnetCIDR: 10.1.0.0/16
    podSubnetOneNodePrefixLen: 24
    vppHostSubnetCIDR: 172.30.0.0/16
    vppHostSubnetOneNodePrefixLen: 24
    vxlanCIDR: 192.168.30.0/24
    srv6:
      servicePolicyBSIDSubnetCIDR: 8fff::/16
      servicePodLocalSIDSubnetCIDR: 9300::/16
      serviceHostLocalSIDSubnetCIDR: 9300::/16
      serviceNodeLocalSIDSubnetCIDR: 9000::/16
      nodeToNodePodLocalSIDSubnetCIDR: 9501::/16
      nodeToNodeHostLocalSIDSubnetCIDR: 9500::/16
      nodeToNodePodPolicySIDSubnetCIDR: 8501::/16
      nodeToNodeHostPolicySIDSubnetCIDR: 8500::/16
controller.conf: |
  enableRetry: true
  delayRetry: 1000000000
  maxRetryAttempts: 3
  enableExpBackoffRetry: true
  delayLocalResync: 5000000000
  startupResyncDeadline: 30000000000
  enablePeriodicHealing: false
  periodicHealingInterval: 30000000000
  delayAfterErrorHealing: 5000000000
  remoteDBProbingInterval: 3000000000
  recordEventHistory: true
  eventHistoryAgeLimit: 60
  permanentlyRecordedInitPeriod: 10
service.conf: |
  cleanupIdleNATSessions: true
  tcpNATSessionTimeout: 180
  otherNATSessionTimeout: 5
  serviceLocalEndpointWeight: 1
  disableNATVirtualReassembly: false
```

— DamiaPoquet, Sep 01 '19 16:09

Hi,

I'm facing the same issue with the latest Contiv version and Kubernetes 1.15. The root cause is that the speaker can't enable proxy ARP, which prevents VPP from answering ARP requests for the VIP address:

```
{"caller":"main.go:301","error":"failed to unmarshal interface data for '/dump/vpp/v2/interfaces/ethernet', error unknown field \"rxModes\" in vpp_interfaces.Interface","msg":"failed to propagate node info to protocol handler","op":"setNode","protocol":"layer2","ts":"2020-02-23T16:11:40.27809185Z"}
{"caller":"vpp_l2_controller.go:188","ethInterfaces":null,"ipAddr":"10.195.123.35","ts":"2020-02-23T16:11:45.292311407Z"}
{"caller":"main.go:301","error":"failed to unmarshal interface data for '/dump/vpp/v2/interfaces/ethernet', error unknown field \"rxModes\" in vpp_interfaces.Interface","msg":"failed to propagate node info to protocol handler","op":"setNode","protocol":"layer2","ts":"2020-02-23T16:11:45.292424992Z"}
{"caller":"vpp_l2_controller.go:273","error":"failed to update proxyArp configuration, error context deadline exceeded","ts":"2020-02-23T16:11:49.307391363Z"}
```

If you enable proxy ARP manually, it starts working.
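For reference, enabling proxy ARP by hand means telling VPP to answer ARP for the VIP range on the uplink interface. A sketch via vppctl (interface name and address range are placeholders; the exact CLI syntax varies between VPP releases, with older releases using `set ip arp proxy` instead of `set arp proxy`):

```sh
# Answer ARP requests for the MetalLB VIP range (placeholder addresses)
vppctl set arp proxy 10.195.123.35 - 10.195.123.40

# Enable proxy ARP on the uplink interface (placeholder name)
vppctl set interface proxy-arp GigabitEthernet0/8/0 enable
```

Once this is in place, arping the VIP from another host on the segment should get a reply from VPP.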

@DamiaPoquet, were you able to find a workaround?

— laaubert, Feb 23 '20 19:02

Hi @laaubert

Sorry, but I was not able to find a valid workaround. I moved to Cilium and never looked back!

— DamiaPoquet, Feb 23 '20 20:02