K6 can't launch jobs when using LinkerD

Open MarkSRobinson opened this issue 4 years ago • 6 comments

When launching jobs in a namespace that has LinkerD enabled, the starter job is unable to contact the downstream pods to trigger them. If I run the jobs in a namespace without LinkerD, everything works as expected.

starter pod logs

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (7) Failed to connect to 100.64.110.231 port 6565: Connection refused

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (7) Failed to connect to 100.64.110.195 port 6565: Connection refused

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (7) Failed to connect to 100.64.110.179 port 6565: Connection refused

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (7) Failed to connect to 100.64.110.113 port 6565: Connection refused

Starter pod configuration https://gist.github.com/MarkSRobinson/702ae2c4ebb8f2a02509c72c30c80531

MarkSRobinson, Aug 03 '21 21:08

Hi @MarkSRobinson! Thanks for opening this. Happy to see that you're trying to run this with LinkerD!

I'm wondering if @KnechtionsCoding has some thoughts on this, as he added Istio support to the operator.

dgzlopes, Aug 05 '21 10:08

The first thing I'd check is whether the linkerd sidecar is being injected into jobs (I know support for that was a rather recent addition).

Secondly, linkerd, like istio, probably doesn't support IP-based routing. The latest version of the operator, which is being released shortly, will instead use a unique service for each pod to allow DNS-based routing. So that's coming, and it's probably required here.
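To make the IP versus DNS distinction concrete, here is a minimal sketch of how a starter could build its targets from per-pod service names instead of pod IPs. The service name `k6-sample-1` and namespace `load-tests` are hypothetical, the default `cluster.local` cluster domain is assumed, and the `/v1/status` path and payload reflect my understanding of the k6 REST API on port 6565 rather than the operator's actual starter command.

```python
import json

K6_API_PORT = 6565  # port seen in the starter pod logs above


def status_url(service: str, namespace: str, cluster_domain: str = "cluster.local") -> str:
    """Build a DNS-based URL for a runner's k6 REST API.

    Uses the standard Kubernetes service DNS form
    <service>.<namespace>.svc.<cluster_domain> instead of a pod IP.
    """
    host = f"{service}.{namespace}.svc.{cluster_domain}"
    return f"http://{host}:{K6_API_PORT}/v1/status"


def resume_payload() -> str:
    """JSON body asking a paused k6 instance to start (assumed payload shape)."""
    return json.dumps(
        {"data": {"type": "status", "id": "default", "attributes": {"paused": False}}}
    )


if __name__ == "__main__":
    # Hypothetical per-pod service name and namespace, for illustration only.
    print(status_url("k6-sample-1", "load-tests"))
    print(resume_payload())
```

Because the mesh proxy routes on the service hostname, a request built this way should be something linkerd can match to a known destination, unlike the raw `100.64.x.x` addresses in the logs.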

Third, and this isn't a problem until the test actually starts, but it will be the next one you run into: the proxy won't quit unless you expressly tell it to, meaning the job will never finish. See https://tech.ingrid.com/on-linkerd-and-kubernetes-jobs/ for information on how to fix that.
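A rough sketch of that kind of fix: wrap the job's real command so that, once it exits, the wrapper asks the sidecar to shut down. This assumes the linkerd proxy's admin endpoint listens on `localhost:4191` and accepts `POST /shutdown`; the wrapper itself (`run_then_shutdown`) is a hypothetical helper, not something the operator ships.

```python
import subprocess
import sys
import urllib.request

# Assumed linkerd-proxy admin endpoint; adjust if your mesh is configured differently.
LINKERD_SHUTDOWN_URL = "http://localhost:4191/shutdown"


def shutdown_proxy(url: str = LINKERD_SHUTDOWN_URL, timeout: float = 5.0) -> None:
    """Send an empty POST asking the sidecar proxy to exit."""
    req = urllib.request.Request(url, data=b"", method="POST")
    with urllib.request.urlopen(req, timeout=timeout):
        pass


def run_then_shutdown(cmd, shutdown=shutdown_proxy) -> int:
    """Run cmd, then always try to stop the proxy so the Job can complete."""
    try:
        return subprocess.run(cmd).returncode
    finally:
        try:
            shutdown()
        except OSError as exc:  # urllib's URLError is a subclass of OSError
            print(f"proxy shutdown failed: {exc}", file=sys.stderr)


if __name__ == "__main__":
    # Example: python wrapper.py k6 run /test/test.js
    sys.exit(run_then_shutdown(sys.argv[1:]))
```

The shutdown call sits in a `finally` block so the proxy is also stopped when the wrapped command fails; otherwise a failed test run would leave the pod hanging just like a successful one.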

knechtionscoding, Aug 05 '21 13:08

https://github.com/linkerd/linkerd-await appears to be an option we could look at supporting. @dgzlopes I wonder if we should have a conversation about adding a serviceMesh field to the CRD. Then, instead of setting scuttle enable or linkerd enable, a user would pass in serviceMesh: istio or serviceMesh: linkerd and we would generate the command for them. We could also build different containers for each service mesh we support. So there would be a runner, controller, and starter, and then for each one add a - to it
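To illustrate the idea, here is a minimal sketch of how a single serviceMesh value could be turned into a wrapped command. The function name `wrap_command` and the mapping are hypothetical. It assumes scuttle is used as a simple prefix to the real command and that linkerd-await is invoked as `linkerd-await --shutdown -- <cmd>`; both should be checked against each tool's README before relying on them.

```python
from typing import List, Optional

# Hypothetical mapping from the proposed serviceMesh value to a command prefix.
MESH_WRAPPERS = {
    None: [],                                          # no mesh: run the command as-is
    "istio": ["scuttle"],                              # assumed: scuttle wraps the command
    "linkerd": ["linkerd-await", "--shutdown", "--"],  # assumed linkerd-await usage
}


def wrap_command(cmd: List[str], service_mesh: Optional[str] = None) -> List[str]:
    """Return cmd prefixed with the wrapper for the chosen service mesh."""
    if service_mesh not in MESH_WRAPPERS:
        raise ValueError(f"unsupported serviceMesh: {service_mesh!r}")
    return MESH_WRAPPERS[service_mesh] + list(cmd)


if __name__ == "__main__":
    print(wrap_command(["k6", "run", "/test/test.js"], "linkerd"))
```

Keeping the mapping in one table means supporting another mesh would be a one-line change, which is roughly the appeal of a single field over per-mesh enable flags.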

knechtionscoding, Sep 04 '21 02:09

Sounds interesting. Let's have a conversation at our next meeting :muscle:

dgzlopes, Sep 06 '21 11:09