ovs-cni icon indicating copy to clipboard operation
ovs-cni copied to clipboard

OVS Traffic Mirroring Proposal

Open l-rossetti opened this issue 2 years ago • 26 comments

This is a proposal for supporting the traffic mirroring feature of OVS into ovs-cni plugin. The topic has been initially discussed in issue 219. The goal is to create and manage multiple mirror ports through either ovs-cni or a dedicated cni-plugin in Network Configuration List (Multus) as defined in CNI spec 0.4.0.

l-rossetti avatar Mar 11 '22 08:03 l-rossetti

Hi @l-rossetti. Thanks for your PR.

I'm waiting for a k8snetworkplumbingwg member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

kubevirt-bot avatar Mar 11 '22 08:03 kubevirt-bot

Hey @l-rossetti, just wanted to check - is there any new development on this? Any way I can help?

phoracek avatar Apr 13 '22 12:04 phoracek

Hey @l-rossetti, just wanted to check - is there any new development on this? Any way I can help?

Hi @phoracek! We just updated the PR based on the discussion. Could please review the proposal again? If it looks ok, we can draft a first implementation of the two plugins: ovs-mirror-producer, ovs-mirror-consumer. If you agree, we can start by focusing only on the cmdAdd, leaving cmdCheck and cmdDel for a second round.

We still have a doubt about the ingress and egress keywords: in the NAD should we consider ingress and egress from the point of view of OVS or the point of view of the Pod? OVS-DB specification states that:

select_dst_port: set of weak reference to Ports
     Ports on which departing packets are selected for mirroring.
select_src_port: set of weak reference to Ports
     Ports on which arriving packets are selected for mirroring.

But here we could take the point of view of the pod if it makes sense.

Moreover, where do you suggest to put the code? Is it ok if we create two public repos in our organization with the initial draft? Do you suggest to start with only one repo for both consumer and producer, or two dedicated repo? Anyway, please, feel free to suggest what you think is more suitable to everyone.

l-rossetti avatar Apr 13 '22 14:04 l-rossetti

Apologies for my late reply.

Thanks for the update. I left only some suggestions to fix formatting, but the proposal sounds good to me.

If you find ingress and egress problematic, we can stick to OVS' naming, that works fine.

About the code, for me the ideal process would be following:

  1. You can create a quick PoC, ignoring the code quality (use bash if it helps you get it done faster), just to get a demo running to see if it will all work and get a feeling for it.
  2. Extend this repository with the new functionality. Unlike the PoC, this will require well organized code, e2e tests, documentation. We can split this step to cmdAdd, cmdDel, cmdCheck. Or even start by implementing just a subset of the API. The only condition is that after each merge, we will be able to exercise and test the code.

I think that it would be useful to keep the two new CNIs under this project, to leverage the existing infrastructure. But if you'd prefer to keep it outside, I think that at least keeping both consumer and producer under the same roof would be better over splitting it.

That being said, if you'd prefer a different workflow, I'm totally happy to discuss that.

phoracek avatar Apr 22 '22 15:04 phoracek

Hi @phoracek, we ended up with a new repository as a quick PoC containing the implementation for both mirror-producer and mirror-consumer: auroralabs-eu/ovs-cni-mirror There's a minimal implementation for cmdAdd, cmdDel and cmdCheck.

Some notes:

  • cmdCheck has never been executed (our current environment does not support spec 0.4.0)
  • in ovsdb.go we are not passing NetConf.Mirror from types.go to avoid a dependency between the two files
  • in cmd folder you can find two files, one for the producer and the other for the consumer, as you did for ovs-cni and marker
  • we had to duplicate the cleanPorts function from ovs-cni due to an unexpected behaviour we discovered (please, have a look below for more details).

CleanPorts workaround step-by-step

  • Whenever the consumer is killed, the container restarts and the CRI calls again cmdAdd for ovs-cni creating a new port/interface.
  • Then, cmdAdd is propagated to the next multus-chain element (ovs-cni-mirror-consumer).
  • When the mirror-consumer tries to add the output_port, it finds that the Mirror table already has a reference to the old output_port, cause the related cmdDel is invoked only after some time.
  • Since output_port column does not accept an array, our ovs-mirror plugin fails returning a managed error.
  • In this scenario, when ovs-mirror-consumer fails, the CRI should call the cmdDel of the upstream ovs-cni plugin, but for some reason the upstream cmdDel is never invoked, so the database becomes dirty.
  • The effect of this behavior is that ovsdb shows a bunch of ports and interfaces that actually do not exist (showing the error 'device does not exist' in the interface table). The workaround is to call the cleanPorts function at each cmdDel execution of ovs-cni-mirror-*.

Doubts:

  • We think the cleanPorts workaround is not a good solution because the port deletion should not be a responsibility of ovs-cni-mirror-*, but of ovs-cni who created the interface.
  • In our test environment we didn't see any call to ovs-cni cleanPorts. Why cleanPorts is not invoked for example at the end of each ovs-cni cmdDel?
  • Do you have any clue why the upstream cmdDel is never invoked by the CRI if the downstream plugin fails with an error on cmdAdd?
  • We had to find a workaround to correctly handle the output_port values for the Mirror table. As you can see in the comments of ovsdb.go, libovsdb returns a different type in case output_port is empty (ovsdb.OvsSet) or it contains a reference (ovsdb.UUID). For this reason we ended up with the convertToArray function. Do you have any better idea/solution?

Do you have any more suggestion/idea to improve the implementation/integration?

l-rossetti avatar Apr 26 '22 14:04 l-rossetti

This looks great!

About the cleanup issue, would it help if we called the cleanup function on the beginning of every cmdAdd too? To make sure we start with a clean state. We should do that in both OVS CNI and the mirror CNI. If a node was forcefully restarted, DEL would not be called, so we should do it in ADD.

As for why it is not executed even in DEL, we execute the cleanup function only when something does not go as expected. If DEL progresses well, we should remove all the ports explicitly. Only if there is an issue during the DEL command, we resort to best effort cleanup using cleanPorts https://github.com/k8snetworkplumbingwg/ovs-cni/blob/main/pkg/plugin/plugin.go#L519. However, that means that leftovers from other contains would not get cleaned up. Perhaps we could unconditionally call cleanPorts on every DEL.

I don't know why CRI is not running DEL, but I would suspect that they rely on ADD being transactional - if we fail ADD, we should clean up after ourselves.

I can't think of a good solution for the convertToArray issue.

I did a high level review and it looks pretty straightforward. I have just a few notes/questions:

  • Why does the consumer need previous interface? It makes sense for producers, where we want to sniff the traffic of the previous interface. But consumer interface should be not dependent on the previous one, no? I'd expect that I can create a consumer without connecting the container to the bridge https://github.com/auroralabs-eu/ovs-cni-mirror/blob/master/pkg/consumer/consumer.go#L94
  • Is it not possible to have more than one consumer on a mirror? https://github.com/auroralabs-eu/ovs-cni-mirror/blob/master/pkg/consumer/consumer.go#L106

phoracek avatar Apr 29 '22 15:04 phoracek

About the cleanup issue, would it help if we called the cleanup function on the beginning of every cmdAdd too? To make sure we start with a clean state. We should do that in both OVS CNI and the mirror CNI. If a node was forcefully restarted, DEL would not be called, so we should do it in ADD.

Yes, we agree it is better to always call it at every ADD, knowing the state is always the cleanest possible. The thing is that the clean-up should be invoked only by ovs-cni plugins, because it involves ports/interfaces, while ovs-mirror plugin should treat only mirrors and should attach the ports created by ovs-cni to the mirror itself. Is shouldn't be its responsibility to delete ports/interfaces that it didn't create.

As for why it is not executed even in DEL, we execute the cleanup function only when something does not go as expected. If DEL progresses well, we should remove all the ports explicitly. Only if there is an issue during the DEL command, we resort to best effort cleanup using cleanPorts https://github.com/k8snetworkplumbingwg/ovs-cni/blob/main/pkg/plugin/plugin.go#L519. However, that means that leftovers from other contains would not get cleaned up. Perhaps we could unconditionally call cleanPorts on every DEL.

Ok , we understand the logic behind. But yes, as before, we may check the state of the db at each ovs-cni invocation. This would simplify also the logic built around the cleanup caller in ovs-cni. On the other side may seem overkill.

I don't know why CRI is not running DEL, but I would suspect that they rely on ADD being transactional - if we fail ADD, we should clean up after ourselves.

Probably we misunderstood the spec in case of chained plugin: can't find an explicit sentence specifying that a DEL for the upstream plugin (e.g. ovs-cni) of the chain must be called by the CRI in case the downstream one (e.g. ovs-mirror) fails.

On the other hand we found this in the plugin-delegation section of the SPEC: "If, on ADD, a delegated plugin fails, the "upper" plugin should execute again with DEL before returning failure."

So this doubt comes to our minds: are we sure ovs-cni-mirror should be implemented as a dedicated plugin instead of a delegated plugin (as done for the IPAM one)? Cause actually the ovs-cni-mirror will never be called by itself but it will be always in chain with ovs-cni, cause is responsibility of the latter to create the interface/port. The sentence of the spec I quoted above would allow us to avoid the call to the cleanup function in case of ovs-mirror failure. But in that case there are some other impacts to evaluate. What do you think?

Why does the consumer need previous interface? It makes sense for producers, where we want to sniff the traffic of the previous interface. But consumer interface should be not dependent on the previous one, no? I'd expect that I can create a consumer without connecting the container to the bridge https://github.com/auroralabs-eu/ovs-cni-mirror/blob/master/pkg/consumer/consumer.go#L94

It is needed because the consumer port/interface created by ovs-cni must be added as "output_port" in the mirror table. So we need the previousResult to be able to fetch the port reference and add it to the output_port column of the table here.

Is it not possible to have more than one consumer on a mirror? https://github.com/auroralabs-eu/ovs-cni-mirror/blob/master/pkg/consumer/consumer.go#L106

Unfortunately not, using 'output_port', cause OvsDB-Schema manual specifies at page 41 that "output_port" is an "optional weak reference to Port", not a "Set of weak references to Ports".

We see only two different options to have more than one consumer for the same tarffic:

  • create multiple mirrors, one for each consumer and allow the other pods to produce to all of them., e.g.:
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: ovs-br-emu-cni-net1-blue-blue-ie-red-red-e
  namespace: emu-cni
  annotations:
    k8s.v1.cni.cncf.io/resourceName: ovs-cni.network.kubevirt.io/ovs-bridge
spec:
  config: |-
    {
      "cniVersion": "0.4.0",
      "plugins": [
        {
          "type": "ovs",
          "vlan": 1,
          "bridge": "br-emu-cni"
        },
        {
          "type": "ovs-cni-mirror-producer",
          "bridge": "br-emu-cni",
          "mirrors": [
            {
              "name": "emucniblue",
              "ingress": true,
              "egress": true
            },
            {
              "name": "emucnired",
              "ingress": false,
              "egress": true
            }
          ]
        }
      ]
    }
  • implement ovs-cni-mirror to support "output_vlan" that is "The frames will be sent out all ports that trunk output_vlan, as well as any ports with implicit VLAN output_vlan. When a mirrored frame is sent out a trunk port, the frame’s VLAN tag will be set to output_vlan, replacing any existing tag; when it is sent out an implicit VLAN port, the frame will not be tagged. This type of mirroring is sometimes called RSPAN."

About the latter option we excluded it in our proposal to fulfill the requirements of our project and we don't have enough knowledge about RSPAN. For sure this would be a nice to have improvement for the future.

l-rossetti avatar May 04 '22 08:05 l-rossetti

We have a question about UUIDs, because here we are using the mirrorName as done in ovs-cni with portName. However, mirrorName is a value passed by the user via NAD (also with special characters like "-" which is accepted as mirror-name by ovs), so it must be validated and cleaned to be used as "Operation.UUIDName". Instead, in ovs-cni this is not required, because portName is generated from interfaceName.

How could we create a valid UUIDName for mirrors? For example we can create a random string with a fixed length and without special characters. Do you have a better solution? Thanks

Ks89 avatar May 05 '22 09:05 Ks89

Hi @phoracek how are you doing? Any comment about the previous messages?

l-rossetti avatar Jun 20 '22 08:06 l-rossetti

Hello,

Ok , we understand the logic behind. But yes, as before, we may check the state of the db at each ovs-cni invocation. This would simplify also the logic built around the cleanup caller in ovs-cni. On the other side may seem overkill.

If this simplifies the code, I don't mind. I don't imagine it would have a terrible performance impact since we access the OVS database using the unix socket.

So this doubt comes to our minds: are we sure ovs-cni-mirror should be implemented as a dedicated plugin instead of a delegated plugin (as done for the IPAM one)? Cause actually the ovs-cni-mirror will never be called by itself but it will be always in chain with ovs-cni, cause is responsibility of the latter to create the interface/port. The sentence of the spec I quoted above would allow us to avoid the call to the cleanup function in case of ovs-mirror failure. But in that case there are some other impacts to evaluate.

In case of IPAM, the IPAM CNI returns the IP that should be configured. And then the actual IP assignment is done by the main CNI. That is unlike mirror CNI where the configuration is actually applied by the CNI.

The reference plugins under https://github.com/containernetworking/plugins/tree/main/plugins/meta are all relevant only when chained with some other plugin, like mirror. Those plugins probably suffer the same cleanup issue. Maybe we should raise that limitation as a bug of CNI, so it could be resolved for all?

I'm not totally against the idea of using delegates, but we should have a good reason to break this pattern.

About the latter option we excluded it in our proposal to fulfill the requirements of our project and we don't have enough knowledge about RSPAN. For sure this would be a nice to have improvement for the future.

+1 on keeping it simple for now with the first option.

How could we create a valid UUIDName for mirrors? For example we can create a random string with a fixed length and without special characters. Do you have a better solution?

Generating a random UUID sounds fine. I assume we can then keep the mirror name as other-config, or something similar?

If I missed any other question, please don't hesitate to bring it up.

phoracek avatar Jun 20 '22 19:06 phoracek

Hi @phoracek following your suggestion here we merged this repo into ovs-cni.

The main changes with respect to the previous commit are:

  • upgrade go to 1.18 to use generics
  • integrate MirrorNetConf in types.go and use it in config.go with GO generics to avoid code duplication
  • change "insert" operation in Mirror table with a fixed transaction uuidName HERE, because named-uuid represents a reference to inserted rows and not a real UUID as explained in RFC7047
  • as discussed previously, we removed cleanPorts from ovs-cni-mirror plugins and added both at the beginning of cmdAdd and at the end of cmdDel of ovs-cni plugin

To discuss:

  • integrate ovs-cni-mirror in your CI/CD, please could you provide more information about this topic?
  • do you think that the documentation traffic-mirroring.md is complete enough?
  • what is missing in this PR to be eligible for the merge?

Ks89 avatar Jul 07 '22 14:07 Ks89

* integrate ovs-cni-mirror in your CI/CD, please could you provide more information about this topic?

We should get at least a silver bullet test verifying that the solution works into our e2e test suite, running on Kubernetes. All that should be required is to introduce a new module under tests/, similar to https://github.com/k8snetworkplumbingwg/ovs-cni/blob/main/tests/ovs_test.go.

Rest of the test cases may be covered in integration tests, similar to this one: https://github.com/k8snetworkplumbingwg/ovs-cni/blob/main/pkg/plugin/plugin_test.go. These can run without Kubernetes and are considerably faster to execute.

* do you think that the documentation `traffic-mirroring.md` is complete enough?

I like the contents of the docs, thanks for pictures and plenty of examples! The only extra thing I'd ask for is a sentence or two about each of the cases. I may have some minor formatting comments once I get to review it in detail.

* what is missing in this PR to be eligible for the merge?

Tests and thorough review. Once you post the test coverage, I would try it out locally and then review the whole PR. I skimmed through the code and it seems that wast majority of the changes only add new code and modules, leaving existing code intact, so I don't anticipate issues there.

phoracek avatar Jul 20 '22 12:07 phoracek

/test pull-e2e-ovs-cni

phoracek avatar Aug 05 '22 10:08 phoracek

/retest

phoracek avatar Aug 05 '22 13:08 phoracek

/retest

phoracek avatar Aug 05 '22 14:08 phoracek

/retest

phoracek avatar Aug 05 '22 14:08 phoracek

@Ks89 tests are green now \o/ let me know when it will be ready for a review

phoracek avatar Aug 05 '22 14:08 phoracek

@Ks89 tests are green now \o/ let me know when it will be ready for a review

ok, thanks. I need to write some other tests, but I hope to complete everything next week.

Ks89 avatar Aug 05 '22 14:08 Ks89

/retest

Ks89 avatar Aug 10 '22 08:08 Ks89

@Ks89: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

kubevirt-bot avatar Aug 10 '22 08:08 kubevirt-bot

/retest

phoracek avatar Aug 10 '22 10:08 phoracek

Hi @phoracek, we added unit tests and e2e tests as requested and improved a little bit the code.

Some point to notice:

  • we added a new parameter to CreatePrivilegedPodWithIP to pass an additional command when starting up the pod. This is required to run tcpdump, because we tried by running it in background tcpdump ... & but it was returning error exit code 1. There is a known issue also discussed here and none of those solutions worked.
  • we have no idea why running unit tests on your infrastructure returned an error. If we run make test on our local machine, everything looks fine: 76 tests passed with no errors.
  • we have no idea why running e2e tests with cni version 0.3.1 we receive the error below on clusterApi.DeletePodsInTestNamespace(). It seems a multus error. If we try the same for both 0.3.0 and 0.4.0 everything works as expected. Also, If we remove DeletePodsInTestNamespace and we use CreateTestNamespace and RemoveTestNamespace every test it seems to be ok. Why? We noticed that current multus version in kubevirtci is 3.4.2, but the latest is 3.9. Is there a way to update it?
Warning  FailedKillPod  5s (x11 over 2m12s)  kubelet            error killing pod: failed to "KillPodSandbox" for "67e847f2-1a74-43c3-820b-c6a33e1df363" with KillPodSandboxError: "rpc error: code = Unknown desc = failed to destroy network for pod sandbox k8s_pod-prod-2_test-namespace_67e847f2-1a74-43c3-820b-c6a33e1df363_0(2bc2f7c62a00a1d5f9729d3436be2d9f754c6cb665526993ebb8a51356823402): error removing pod test-namespace_pod-prod-2 from CNI network \"multus-cni-network\": delegateDel: error invoking ConflistDel - \"ovs-net-prod\": conflistDel: error in getting result from DelNetworkList: netplugin failed with no error message"

Ks89 avatar Aug 11 '22 15:08 Ks89

Awesome, thanks! I bumped the version of Multus via https://github.com/k8snetworkplumbingwg/ovs-cni/pull/238. I'll go over the PR and the rest of your points during this week.

/retest

phoracek avatar Aug 15 '22 12:08 phoracek

/retest

phoracek avatar Aug 16 '22 12:08 phoracek

/retest-required

l-rossetti avatar Aug 17 '22 07:08 l-rossetti

@l-rossetti: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest-required

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

kubevirt-bot avatar Aug 17 '22 07:08 kubevirt-bot

/retest

phoracek avatar Aug 17 '22 11:08 phoracek

Running the tests locally, I noticed that removal of pods using the new CNI fail due to nil pointer dereference. That causes them to get stuck terminating.

@l-rossetti @Ks89 could you please look into that? It does not fail the test but since it gets the pod stuck for several minutes, it leads into a timout of the test execution.

   Warning  FailedKillPod   8s (x2 over 22s)  kubelet            (combined from similar events): error killing pod: failed to "KillPodSandbox" for "53299780-60c3-4a98-94bd-481bd3b10f5e" with KillPodSandboxError: "rpc error: code = Unknown desc = failed to destroy network for pod sandbox k8s_pod-prod-2_test-namespace_53299780-60c3-4a98-94bd-481bd3b10f5e_0(7f744c750ec5aa5778cac31419d701f90c66ee660792d6b96c7d326637f8244b): error removing pod test-namespace_pod-prod-2 from CNI network \"multus-cni-network\": delegateDel: error invoking ConflistDel - \"ovs-net-prod\": conflistDel: error in getting result from DelNetworkList: netplugin failed: \"2022/08/19 12:13:57 CNI DEL was called for container ID: 7f744c750ec5aa5778cac31419d701f90c66ee660792d6b96c7d326637f8244b, network namespace /var/run/netns/0d7b05ac-6229-4e18-8d07-c1a4b2d833b3, interface name net1, configuration: {\\\"bridge\\\":\\\"br-test\\\",\\\"cniVersion\\\":\\\"0.3.0\\\",\\\"mirrors\\\":[{\\\"egress\\\":true,\\\"ingress\\\":true,\\\"name\\\":\\\"mirror-1\\\"}],\\\"name\\\":\\\"ovs-net-prod\\\",\\\"type\\\":\\\"ovs-cni-mirror-producer\\\"}\\npanic: runtime error: invalid memory address or nil pointer dereference\\n[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x64341a]\\n\\ngoroutine 1 [running, locked to thread]:\\npanic({0x66e240, 0x893b40})\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/panic.go:941 +0x397 fp=0xc000135b40 sp=0xc000135a80 pc=0x432c97\\nruntime.panicmem(...)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/panic.go:220\\nruntime.sigpanic()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/signal_unix.go:818 +0x336 fp=0xc000135b90 sp=0xc000135b40 pc=0x4485b6\\ngithub.com/k8snetworkplumbingwg/ovs-cni/pkg/mirror-producer.CmdDel(0xc000158ee0)\\n\\t/home/phoracek/code/ovs-cni/pkg/mirror-producer/producer.go:132 +0x1da fp=0xc000135cd0 sp=0xc000135b90 pc=0x64341a\\ngithub.com/containernetworking/cni/pkg/skel.(*dispatcher).checkVersionAndCall(0xc000135eb8, 0xc000158ee0, {0x725d08, 0xc00015e060}, 0x6da910)\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/containernetworking/cni/pkg/skel/skel.go:166 +0x20a fp=0xc000135d60 sp=0xc000135cd0 pc=0x524c8a\\ngithub.com/containernetworking/cni/pkg/skel.(*dispatcher).pluginMain(0xc000135eb8, 0x894f60?, 0xc000135ea0?, 0x44d489?, {0x725d08, 0xc00015e060}, {0xc00001c2a0, 0x25})\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/containernetworking/cni/pkg/skel/skel.go:242 +0x287 fp=0xc000135e58 sp=0xc000135d60 pc=0x5251a7\\ngithub.com/containernetworking/cni/pkg/skel.PluginMainWithError(...)\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/containernetworking/cni/pkg/skel/skel.go:275\\ngithub.com/containernetworking/cni/pkg/skel.PluginMain(0x6ade80?, 0x10?, 0xc000135f50?, {0x725d08?, 0xc00015e060?}, {0xc00001c2a0?, 0x405bb9?})\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/containernetworking/cni/pkg/skel/skel.go:290 +0xd1 fp=0xc000135f08 sp=0xc000135e58 pc=0x525791\\nmain.main()\\n\\t/home/phoracek/code/ovs-cni/cmd/mirror/producer/main.go:27 +0x153 fp=0xc000135f80 sp=0xc000135f08 pc=0x643c93\\nruntime.main()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:250 +0x212 fp=0xc000135fe0 sp=0xc000135f80 pc=0x435812\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc000135fe8 sp=0xc000135fe0 pc=0x461981\\n\\ngoroutine 2 [force gc (idle)]:\\nruntime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:361 +0xd6 fp=0xc000034fb0 sp=0xc000034f90 pc=0x435bd6\\nruntime.goparkunlock(...)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:367\\nruntime.forcegchelper()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:301 +0xad fp=0xc000034fe0 sp=0xc000034fb0 pc=0x435a6d\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc000034fe8 sp=0xc000034fe0 pc=0x461981\\ncreated by runtime.init.6\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:289 +0x25\\n\\ngoroutine 3 [GC sweep wait]:\\nruntime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:361 +0xd6 fp=0xc000035790 sp=0xc000035770 pc=0x435bd6\\nruntime.goparkunlock(...)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:367\\nruntime.bgsweep(0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgcsweep.go:278 +0x8e fp=0xc0000357c8 sp=0xc000035790 pc=0x422cee\\nruntime.gcenable.func1()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:177 +0x26 fp=0xc0000357e0 sp=0xc0000357c8 pc=0x4188e6\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc0000357e8 sp=0xc0000357e0 pc=0x461981\\ncreated by runtime.gcenable\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:177 +0x6b\\n\\ngoroutine 4 [runnable]:\\nruntime.(*pageAlloc).scavenge(0x8b8c28, 0x10000)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgcscavenge.go:441 +0x15c fp=0xc000035f20 sp=0xc000035f18 pc=0x420e3c\\nruntime.bgscavenge(0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgcscavenge.go:336 +0x32e fp=0xc000035fc8 sp=0xc000035f20 pc=0x420bce\\nruntime.gcenable.func2()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:178 +0x26 fp=0xc000035fe0 sp=0xc000035fc8 pc=0x418886\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc000035fe8 sp=0xc000035fe0 pc=0x461981\\ncreated by runtime.gcenable\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:178 +0xaa\\n\\ngoroutine 5 [finalizer wait]:\\nruntime.gopark(0x0?, 0xc000034670?, 0x70?, 0x47?, 0x442611?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:361 +0xd6 fp=0xc000034630 sp=0xc000034610 pc=0x435bd6\\nruntime.goparkunlock(...)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:367\\nruntime.runfinq()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mfinal.go:177 +0xb3 fp=0xc0000347e0 sp=0xc000034630 pc=0x417993\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc0000347e8 sp=0xc0000347e0 pc=0x461981\\ncreated by runtime.createfing\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mfinal.go:157 +0x45\\n\\ngoroutine 6 [IO wait]:\\nruntime.gopark(0xc000101520?, 0xc00002d900?, 0x20?, 0x3c?, 0x49d342?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:361 +0xd6 fp=0xc000093bb0 sp=0xc000093b90 pc=0x435bd6\\nruntime.netpollblock(0xc000200000?, 0x3e00?, 0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/netpoll.go:522 +0xf7 fp=0xc000093be8 sp=0xc000093bb0 pc=0x42e997\\ninternal/poll.runtime_pollWait(0x7ff0c2782198, 0x72)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/netpoll.go:302 +0x89 fp=0xc000093c08 sp=0xc000093be8 pc=0x45cec9\\ninternal/poll.(*pollDesc).wait(0xc00010e380?, 0xc000200000?, 0x0)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/internal/poll/fd_poll_runtime.go:83 +0x32 fp=0xc000093c30 sp=0xc000093c08 pc=0x4af4b2\\ninternal/poll.(*pollDesc).waitRead(...)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/internal/poll/fd_poll_runtime.go:88\\ninternal/poll.(*FD).Read(0xc00010e380, {0xc000200000, 0x3e00, 0x3e00})\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/internal/poll/fd_unix.go:167 +0x25a fp=0xc000093cb0 sp=0xc000093c30 pc=0x4afe1a\\nnet.(*netFD).Read(0xc00010e380, {0xc000200000?, 0xc000706090?, 0xc000093d60?})\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/net/fd_posix.go:55 +0x29 fp=0xc000093cf8 sp=0xc000093cb0 pc=0x4ee1c9\\nnet.(*conn).Read(0xc00000e6a8, {0xc000200000?, 0x4d218d?, 0x0?})\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/net/net.go:183 +0x45 fp=0xc000093d40 sp=0xc000093cf8 pc=0x4f8f05\\nencoding/json.(*Decoder).refill(0xc00010ac80)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/encoding/json/stream.go:165 +0x17f fp=0xc000093d90 sp=0xc000093d40 pc=0x4d5c1f\\nencoding/json.(*Decoder).readValue(0xc00010ac80)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/encoding/json/stream.go:140 +0xbb fp=0xc000093de0 sp=0xc000093d90 pc=0x4d581b\\nencoding/json.(*Decoder).Decode(0xc00010ac80, {0x657860, 0xc000012700})\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/encoding/json/stream.go:63 +0x78 fp=0xc000093e10 sp=0xc000093de0 pc=0x4d5558\\ngithub.com/cenkalti/rpc2/jsonrpc.(*jsonCodec).ReadHeader(0xc0000126e0, 0xc000180000, 0xc000180018)\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/cenkalti/rpc2/jsonrpc/jsonrpc.go:94 +0x85 fp=0xc000093e78 sp=0xc000093e10 pc=0x639ac5\\ngithub.com/cenkalti/rpc2.(*Client).readLoop(0xc000158fc0)\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/cenkalti/rpc2/client.go:87 +0x11b fp=0xc000093fb0 sp=0xc000093e78 pc=0x63733b\\ngithub.com/cenkalti/rpc2.(*Client).Run(0x0?)\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/cenkalti/rpc2/client.go:62 +0x19 fp=0xc000093fc8 sp=0xc000093fb0 pc=0x6371f9\\ngithub.com/ovn-org/libovsdb/client.(*ovsdbClient).createRPC2Client.func3()\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/ovn-org/libovsdb/client/client.go:238 +0x26 fp=0xc000093fe0 sp=0xc000093fc8 pc=0x63c7c6\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc000093fe8 sp=0xc000093fe0 pc=0x461981\\ncreated by github.com/ovn-org/libovsdb/client.(*ovsdbClient).createRPC2Client\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/ovn-org/libovsdb/client/client.go:238 +0x3e5\\n\\ngoroutine 7 [GC worker (idle)]:\\nruntime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:361 +0xd6 fp=0xc000036f58 sp=0xc000036f38 pc=0x435bd6\\nruntime.gcBgMarkWorker()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1207 +0xe5 fp=0xc000036fe0 sp=0xc000036f58 pc=0x41a9c5\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc000036fe8 sp=0xc000036fe0 pc=0x461981\\ncreated by runtime.gcBgMarkStartWorkers\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1131 +0x25\\n\\ngoroutine 8 [GC worker (idle)]:\\nruntime.gopark(0x6d32f8b6d0?, 0x0?, 0x0?, 0x0?, 0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:361 +0xd6 fp=0xc000037758 sp=0xc000037738 pc=0x435bd6\\nruntime.gcBgMarkWorker()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1207 +0xe5 fp=0xc0000377e0 sp=0xc000037758 pc=0x41a9c5\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc0000377e8 sp=0xc0000377e0 pc=0x461981\\ncreated by runtime.gcBgMarkStartWorkers\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1131 +0x25\\n\\ngoroutine 9 [GC worker (idle)]:\\nruntime.gopark(0x8d05e0?, 0x1?, 0x5c?, 0xaa?, 0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:361 +0xd6 fp=0xc000037f58 sp=0xc000037f38 pc=0x435bd6\\nruntime.gcBgMarkWorker()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1207 +0xe5 fp=0xc000037fe0 sp=0xc000037f58 pc=0x41a9c5\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc000037fe8 sp=0xc000037fe0 pc=0x461981\\ncreated by runtime.gcBgMarkStartWorkers\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1131 +0x25\\n\\ngoroutine 10 [runnable]:\\nruntime.gcMarkDone()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:775 +0x2be fp=0xc000030758 sp=0xc000030750 pc=0x4198de\\nruntime.gcBgMarkWorker()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1331 +0x2c5 fp=0xc0000307e0 sp=0xc000030758 pc=0x41aba5\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc0000307e8 sp=0xc0000307e0 pc=0x461981\\ncreated by runtime.gcBgMarkStartWorkers\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1131 +0x25\\n\\ngoroutine 11 [GC worker (idle)]:\\nruntime.gopark(0x8d05e0?, 0x2?, 0x1e?, 0xaa?, 0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:361 +0xd6 fp=0xc000030f58 sp=0xc000030f38 pc=0x435bd6\\nruntime.gcBgMarkWorker()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1207 +0xe5 fp=0xc000030fe0 sp=0xc000030f58 pc=0x41a9c5\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc000030fe8 sp=0xc000030fe0 pc=0x461981\\ncreated by runtime.gcBgMarkStartWorkers\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1131 +0x25\\n\\ngoroutine 12 [GC worker (idle)]:\\nruntime.gopark(0x6d32f8eb8a?, 0x3?, 0xcf?, 0x4b?, 0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:361 +0xd6 fp=0xc000031758 sp=0xc000031738 pc=0x435bd6\\nruntime.gcBgMarkWorker()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1207 +0xe5 fp=0xc0000317e0 sp=0xc000031758 pc=0x41a9c5\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc0000317e8 sp=0xc0000317e0 pc=0x461981\\ncreated by runtime.gcBgMarkStartWorkers\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1131 +0x25\\n\\ngoroutine 13 [chan receive]:\\nruntime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:361 +0xd6 fp=0xc000031e40 sp=0xc000031e20 pc=0x435bd6\\nruntime.chanrecv(0xc000062300, 0x0, 0x1)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/chan.go:577 +0x56c fp=0xc000031ed0 sp=0xc000031e40 pc=0x40620c\\nruntime.chanrecv1(0x0?, 0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/chan.go:440 +0x18 fp=0xc000031ef8 sp=0xc000031ed0 pc=0x405c78\\ngithub.com/ovn-org/libovsdb/client.(*ovsdbClient).handleDisconnectNotification(0xc000012630)\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/ovn-org/libovsdb/client/client.go:504 +0x57 fp=0xc000031fc8 sp=0xc000031ef8 pc=0x63dfb7\\ngithub.com/ovn-org/libovsdb/client.(*ovsdbClient).connect.func4()\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/ovn-org/libovsdb/client/client.go:220 +0x26 fp=0xc000031fe0 sp=0xc000031fc8 pc=0x63c226\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc000031fe8 sp=0xc000031fe0 pc=0x461981\\ncreated by github.com/ovn-org/libovsdb/client.(*ovsdbClient).connect\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/ovn-org/libovsdb/client/client.go:220 +0x8b3\\n\\ngoroutine 14 [select]:\\nruntime.gopark(0xc0000326f0?, 0x2?, 0x0?, 0x0?, 0xc0000326dc?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:361 +0xd6 fp=0xc000032550 sp=0xc000032530 pc=0x435bd6\\nruntime.selectgo(0xc0000326f0, 0xc0000326d8, 0x0?, 0x0, 0x0?, 0x1)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/select.go:328 +0x772 fp=0xc000032690 sp=0xc000032550 pc=0x444bf2\\ngithub.com/ovn-org/libovsdb/cache.(*eventProcessor).Run(0xc000241e00, 0xc0000622a0)\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/ovn-org/libovsdb/cache/cache.go:573 +0xae fp=0xc0000327a0 sp=0xc000032690 pc=0x555b4e\\ngithub.com/ovn-org/libovsdb/cache.(*TableCache).Run(0x0?, 0x0?)\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/ovn-org/libovsdb/cache/cache.go:480 +0x1d fp=0xc0000327c0 sp=0xc0000327a0 pc=0x5554bd\\ngithub.com/ovn-org/libovsdb/client.(*ovsdbClient).connect.func5()\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/ovn-org/libovsdb/client/client.go:221 +0x2a fp=0xc0000327e0 sp=0xc0000327c0 pc=0x63c1ca\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc0000327e8 sp=0xc0000327e0 pc=0x461981\\ncreated by github.com/ovn-org/libovsdb/client.(*ovsdbClient).connect\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/ovn-org/libovsdb/client/client.go:221 +0x933\\n\""

I think it comes from: https://github.com/k8snetworkplumbingwg/ovs-cni/blob/23b80f28c0a5ed076e41056aa2f652831c66fee1/pkg/mirror-producer/producer.go#L132

phoracek avatar Aug 19 '22 12:08 phoracek

/retest

phoracek avatar Aug 23 '22 07:08 phoracek

/retest

phoracek avatar Aug 23 '22 09:08 phoracek