
Service pod is not getting replaced by lpkremoteagent, instead it creates a new pod with the agent

Open Anwit opened this issue 9 months ago • 0 comments

I deployed the todo-app and bike-sharing apps, and B2K works perfectly with both of them, in isolated as well as non-isolated mode.

My Kubernetes cluster is deployed on AWS EKS, and I am running VS Code on a Mac with Apple silicon.

When I try to run B2K against my own app, it does not redirect the traffic.

In non-isolated mode it does not give me any error; it just does not redirect the traffic to my local machine.

 *  Executing task: bridge-to-kubernetes.resource 

Redirecting Kubernetes service api to your machine...
Target cluster: arn:aws:eks:us-west-2:<account-id>:cluster/test
Current cluster: arn:aws:eks:us-west-2:<account-id>:cluster/test
Target namespace: <namespace>
Current namespace: <namespace>
Target service name: api
Target service ports: 8080
Using kubernetes service environment variables: true

Retrieving the current context and credentials...
Validating the credentials to access the cluster...
Validating the requirements to replicate resources locally...
Redirecting traffic from the cluster to your machine...
Waiting for 'api-79cbd49f-sjztm' in namespace 'hive' to reach running state...
Deployment '<namespace>/api' patched to run agent.
Remote agent deployed in container 'api' in pod 'api-79cbd49f-sjztm'.
Preparing to run Bridge To Kubernetes configured as pod <namespace>/api-79cbd49f-sjztm ...
Connection established.
Service 'site-api' is available on 127.0.0.1:55049.
Service 'dns-gateway' is available on 127.0.0.1:55050.
Service 'pod-policy-enforcer' is available on 127.0.0.1:55051.
Service 'proxy' is available on 127.0.0.1:55052.
Service 'api' is available on 127.0.0.1:55053.
Service 'dns-gateway-lb' is available on 127.0.0.1:55054.
Container port 8080 is available at localhost:8080.
##################### Environment started. #############################################################
Run /var/folders/rw/w3jy7w895kn2bfnrq185y2t40000gq/T/tmp-650387o5JT5eBLTyn.env.cmd in your existing console to also get connected.
 *  Terminal will be reused by tasks, press any key to close it. 
...
...
2024-05-23T12:25:00.799175Z  INFO common::server: Server listening on [::]:8080

After comparing todo-app with the non-working app, I see that in the non-working app the service pod is NOT getting replaced by lpkremoteagent. Instead, B2K simply creates a new pod with lpkremoteagent, along with a restore pod.

bridge-library-2024-05-23-12-35-49-77973.txt

➜  Bridge To Kubernetes kubectl get pods -o wide
NAME                                         READY   STATUS    RESTARTS        AGE     IP                             NODE                                         NOMINATED NODE   READINESS GATES
api-6466769d8b-skc7f                         1/1     Running   0               22h     2600:1f14:3abd:cd03:1ae6::4    ip-10-3-62-40.us-west-2.compute.internal     <none>           2/2
api-7df9bc54d4-r8hjc                         1/1     Running   0               9m36s   2600:1f14:3abd:cd03:342::3     ip-10-3-135-148.us-west-2.compute.internal   <none>           0/2
api-restore-14c25-gv7dq                      1/1     Running   5 (2m25s ago)   9m32s   2600:1f14:3abd:cd03:342::6     ip-10-3-135-148.us-west-2.compute.internal   <none>           <none>

api-6466769d8b-skc7f: contains my image. READINESS GATES 2/2.

api-7df9bc54d4-r8hjc: contains bridgetokubernetes.azurecr.io/lpkremoteagent:1.3.4. READINESS GATES 0/2.

api-restore-14c25-gv7dq: restore pod. It is restarting continuously, oscillating between CrashLoopBackOff and Running.
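To make the readiness-gate difference between the pods easier to spot when re-running the listing, here is a minimal Python sketch that parses `kubectl get pods -o wide` text and reports pods whose readiness gates are not fully met. The function names are hypothetical, the sample is the listing above with whitespace condensed, and splitting columns on runs of two or more spaces is an assumption about how kubectl pads its output. Whether the 0/2 gates on the agent pod are the reason the original pod is never replaced is not confirmed by anything in this issue.

```python
import re

# Sample copied from the listing above (column padding condensed).
SAMPLE = """
NAME                     READY  STATUS   RESTARTS       AGE    IP                           NODE                                        NOMINATED NODE  READINESS GATES
api-6466769d8b-skc7f     1/1    Running  0              22h    2600:1f14:3abd:cd03:1ae6::4  ip-10-3-62-40.us-west-2.compute.internal    <none>          2/2
api-7df9bc54d4-r8hjc     1/1    Running  0              9m36s  2600:1f14:3abd:cd03:342::3   ip-10-3-135-148.us-west-2.compute.internal  <none>          0/2
api-restore-14c25-gv7dq  1/1    Running  5 (2m25s ago)  9m32s  2600:1f14:3abd:cd03:342::6   ip-10-3-135-148.us-west-2.compute.internal  <none>          <none>
"""


def parse_pod_table(text):
    """Turn the table into a list of dicts keyed by column header.

    Assumption: columns are separated by at least two spaces, so values
    such as '5 (2m25s ago)' or 'NOMINATED NODE' stay in one cell.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    header = re.split(r"\s{2,}", lines[0])
    return [dict(zip(header, re.split(r"\s{2,}", ln))) for ln in lines[1:]]


def unmet_readiness_gates(rows):
    """Return names of pods whose READINESS GATES value is 'met/total' with met < total."""
    names = []
    for row in rows:
        gates = row.get("READINESS GATES", "<none>")
        if "/" not in gates:
            continue  # '<none>' means the pod declares no readiness gates
        met, total = (int(part) for part in gates.split("/"))
        if met < total:
            names.append(row["NAME"])
    return names


if __name__ == "__main__":
    for name in unmet_readiness_gates(parse_pod_table(SAMPLE)):
        print(name)
```

The same functions can be fed the live output by piping `kubectl get pods -o wide` into a file and reading it in place of `SAMPLE`.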

logs from restore pod:

2024-05-23T13:52:41.1843206Z | RestorationJob | WARNG | Found 2 pods for deployment hive/api but expected 1
2024-05-23T13:52:41.1844724Z | RestorationJob | TRACE | Event: RestorationJob-AgentPing <json>{"eventName":"RestorationJob-AgentPing","properties":{"RestorePerformed":"false","NumFailedPings":"10","HasConnectedClients":"","Result":"Failed"},"metrics":{"DurationInMs":10}}</json>
2024-05-23T13:52:46.1887574Z | RestorationJob | TRACE | Dependency: Kubernetes <json>{"name":"Kubernetes","target":"GetV1DeploymentAsync","success":true,"duration":null,"properties":{}}</json>
2024-05-23T13:52:46.2035798Z | RestorationJob | TRACE | Dependency: Kubernetes <json>{"name":"Kubernetes","target":"ListPodsInNamespaceAsync","success":true,"duration":null,"properties":{}}</json>
2024-05-23T13:52:46.2036510Z | RestorationJob | TRACE | Dependency: Kubernetes <json>{"name":"Kubernetes","target":"ListPodsForDeploymentAsync","success":true,"duration":null,"properties":{}}</json>
2024-05-23T13:52:46.2037109Z | RestorationJob | WARNG | Found 2 pods for deployment hive/api but expected 1
2024-05-23T13:52:46.2038360Z | RestorationJob | TRACE | Event: RestorationJob-AgentPing <json>{"eventName":"RestorationJob-AgentPing","properties":{"RestorePerformed":"false","NumFailedPings":"11","HasConnectedClients":"","Result":"Failed"},"metrics":{"DurationInMs":18}}</json>
2024-05-23T13:52:46.2041707Z | RestorationJob | ERROR | Failed to ping agent 12 times. Exiting...
2024-05-23T13:52:46.2267060Z | RestorationJob | TRACE | AssemblyLoadContext unloading
2024-05-23T13:52:46.2345022Z | RestorationJob | TRACE | Process exiting...
2024-05-23T13:52:46.2347077Z | RestorationJob | TRACE | Process exited
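For anyone triaging longer restore-pod logs, the sketch below condenses them into the warnings, errors, and the highest `NumFailedPings` value seen. It assumes every line follows the `timestamp | component | level | message` layout shown above, with optional `<json>...</json>` payloads; the function name and the summary keys are hypothetical, and the sample lines are copied from the log above.

```python
import json
import re

# Lines copied from the restore pod log above.
SAMPLE = """\
2024-05-23T13:52:41.1843206Z | RestorationJob | WARNG | Found 2 pods for deployment hive/api but expected 1
2024-05-23T13:52:41.1844724Z | RestorationJob | TRACE | Event: RestorationJob-AgentPing <json>{"eventName":"RestorationJob-AgentPing","properties":{"RestorePerformed":"false","NumFailedPings":"10","HasConnectedClients":"","Result":"Failed"},"metrics":{"DurationInMs":10}}</json>
2024-05-23T13:52:46.2038360Z | RestorationJob | TRACE | Event: RestorationJob-AgentPing <json>{"eventName":"RestorationJob-AgentPing","properties":{"RestorePerformed":"false","NumFailedPings":"11","HasConnectedClients":"","Result":"Failed"},"metrics":{"DurationInMs":18}}</json>
2024-05-23T13:52:46.2041707Z | RestorationJob | ERROR | Failed to ping agent 12 times. Exiting...
"""

JSON_PAYLOAD = re.compile(r"<json>(.*?)</json>")


def summarize_restore_log(text):
    """Collect WARNG/ERROR messages and the largest NumFailedPings value."""
    summary = {"warnings": [], "errors": [], "max_failed_pings": 0}
    for line in text.splitlines():
        # Split into at most four fields so '|' inside the message is preserved.
        parts = [part.strip() for part in line.split("|", 3)]
        if len(parts) < 4:
            continue  # not a structured log line
        _timestamp, _component, level, message = parts
        if level == "WARNG":
            summary["warnings"].append(message)
        elif level == "ERROR":
            summary["errors"].append(message)
        match = JSON_PAYLOAD.search(message)
        if match:
            payload = json.loads(match.group(1))
            if payload.get("eventName") == "RestorationJob-AgentPing":
                failed = int(payload.get("properties", {}).get("NumFailedPings") or 0)
                summary["max_failed_pings"] = max(summary["max_failed_pings"], failed)
    return summary


if __name__ == "__main__":
    print(summarize_restore_log(SAMPLE))
```

Run against the full log, this makes it quick to see that the job exits because agent pings keep failing while two pods exist for the deployment, which matches the pod listing above.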

Looking for help to debug this issue.

In isolated mode I am still facing the exact same issue described here: https://github.com/microsoft/mindaro/issues/192

Anwit, May 23 '24 12:05