
Server error

Open zhuw0312 opened this issue 1 year ago • 5 comments

Describe the bug

The web UI shows:

image

Please help us solve this problem. Thanks, all.

All pods are in the Running state:

NAME                                                              READY   STATUS      RESTARTS   AGE
93bc2253481b44a32e04a523286b130a01e7a6e2168db1be5c02993f19vbl2r   0/1     Completed   0          5h10m
pixie-operator-index-nt5rk                                        1/1     Running     0          5h9m
vizier-operator-7765b9784-kcv4h                                   1/1     Running     0          39m

[root@wf-dev-mid-k8s-m ~]# kubectl -n plc get po
NAME                                      READY   STATUS    RESTARTS       AGE
api-server-65d8c5d54b-szr4z               1/1     Running   10 (22h ago)   22h
artifact-tracker-server-747d85d96-zhr9c   1/1     Running   0              22h
auth-server-59fdf999f7-4d96j              1/1     Running   0              22h
cloud-proxy-545669b4b7-btkxh              2/2     Running   13 (22h ago)   22h
config-manager-server-66cf8576bf-6m7hl    1/1     Running   0              22h
cron-script-server-7787d46979-t8mxw       1/1     Running   7 (22h ago)    22h
hydra-5dcd57b658-kcqs8                    2/2     Running   0              22h
indexer-server-6bc8cc4f84-4tgm9           1/1     Running   10 (22h ago)   22h
kratos-75897f7b5d-hw9vl                   2/2     Running   0              22h
metrics-server-5dbc9f9fdf-k8k2k           1/1     Running   7 (22h ago)    22h
pl-elastic-es-master-0                    1/1     Running   0              22h
pl-elastic-es-master-1                    1/1     Running   0              22h
pl-nats-0                                 1/1     Running   0              22h
pl-nats-1                                 1/1     Running   0              22h
pl-nats-2                                 1/1     Running   0              22h
plugin-server-798b5f779-jm6k6             1/1     Running   0              22h
postgres-5d7b556bf5-crk64                 1/1     Running   1 (22h ago)    22h
profile-server-84d86766fb-nsrhh           1/1     Running   0              22h
project-manager-server-6ccddb9ff-qtpsp    1/1     Running   0              22h
scriptmgr-server-77b8f9f4dd-5jq4g         1/1     Running   0              22h
vzconn-server-56fbf5b5d8-w7qdk            1/1     Running   10 (22h ago)   22h
vzmgr-server-59dcd6c585-hdsv5             1/1     Running   10 (22h ago)   22h

[root@wf-dev-mid-k8s-m ~]# kubectl -n plm get po
No resources found in plm namespace.
[root@wf-dev-mid-k8s-m ~]# kubectl -n olm get po
NAME                                READY   STATUS    RESTARTS   AGE
catalog-operator-6b6c99fdfd-nqhq2   1/1     Running   0          21h
olm-operator-7b4494bd75-x8kf4       1/1     Running   0          21h

zhuw0312 avatar Jul 14 '23 08:07 zhuw0312

Hi @zhuw0312 ! May I get some more details about your cloud deployment? Did you just follow the instructions for deploying self-hosted cloud without any modifications? What kind of Kubernetes environment are you running this on?

aimichelle avatar Jul 14 '23 16:07 aimichelle

I found error logs from vizier-query-broker-9f84f55d9-wx6cb in the pl namespace, as below:

time="2023-07-17T13:33:56+08:00" level=info msg="Running script" query_id=892aa252-4a1e-4132-93bc-7c409f92ff80
time="2023-07-17T13:34:01+08:00" level=info msg="Running script" query_id=1c9b3537-f781-430d-97f9-37de9e03c42d
time="2023-07-17T13:34:01+08:00" level=info msg="Launched query: 312dbe88-0271-462a-89ca-f3db2696c856"
time="2023-07-17T13:34:01+08:00" level=info msg="Running script" query_id=312dbe88-0271-462a-89ca-f3db2696c856
time="2023-07-17T13:34:01+08:00" level=info msg="Launched query: 56755cd2-03d2-40eb-9d7a-0d2b016024b7"
time="2023-07-17T13:34:01+08:00" level=info msg="Running script" query_id=56755cd2-03d2-40eb-9d7a-0d2b016024b7
time="2023-07-17T13:34:01+08:00" level=info msg="Launched query: 467ed8b2-09f7-4939-bc52-2a6a47d18050"
time="2023-07-17T13:34:01+08:00" level=info msg="Running script" query_id=467ed8b2-09f7-4939-bc52-2a6a47d18050
time="2023-07-17T13:34:01+08:00" level=error msg="failed to execute query" duration=58.020467ms error="rpc error: code = Internal desc = OTel export (carnot node_id=92) failed with error 'UNKNOWN'. Details: Failed to create secure client channel " query_id=467ed8b2-09f7-4939-bc52-2a6a47d18050
time="2023-07-17T13:34:01+08:00" level=warning msg="finished streaming call with code Internal" error="rpc error: code = Internal desc = OTel export (carnot node_id=92) failed with error 'UNKNOWN'. Details: Failed to create secure client channel " fields.time=58.893289ms grpc.code=Internal grpc.method=ExecuteScript grpc.service=px.api.vizierpb.VizierService grpc.start_time="2023-07-17T13:34:01+08:00" peer.address="[::1]:60212" span.kind=server system=grpc
time="2023-07-17T13:34:01+08:00" level=error msg="failed to execute query" duration=73.617731ms error="rpc error: code = Internal desc = OTel export (carnot node_id=260) failed with error 'UNKNOWN'. Details: Failed to create secure client channel " query_id=56755cd2-03d2-40eb-9d7a-0d2b016024b7
time="2023-07-17T13:34:01+08:00" level=warning msg="finished streaming call with code Internal" error="rpc error: code = Internal desc = OTel export (carnot node_id=260) failed with error 'UNKNOWN'. Details: Failed to create secure client channel " fields.time=74.281972ms grpc.code=Internal grpc.method=ExecuteScript grpc.service=px.api.vizierpb.VizierService grpc.start_time="2023-07-17T13:34:01+08:00" peer.address="[::1]:60212" span.kind=server system=grpc
time="2023-07-17T13:34:01+08:00" level=error msg="failed to execute query" duration=81.917934ms error="rpc error: code = Internal desc = OTel export (carnot node_id=116) failed with error 'UNKNOWN'. Details: Failed to create secure client channel " query_id=312dbe88-0271-462a-89ca-f3db2696c856
time="2023-07-17T13:34:01+08:00" level=warning msg="finished streaming call with code Internal" error="rpc error: code = Internal desc = OTel export (carnot node_id=116) failed with error 'UNKNOWN'. Details: Failed to create secure client channel " fields.time=84.028092ms grpc.code=Internal grpc.method=ExecuteScript grpc.service=px.api.vizierpb.VizierService grpc.start_time="2023-07-17T13:34:01+08:00" peer.address="[::1]:60212" span.kind=server system=grpc
time="2023-07-17T13:34:06+08:00" level=info msg="Running script" query_id=9b8b5fe2-ad93-4ba2-87a3-f9552b1be2eb

And the error logs of api-server-65d8c5d54b-tbfjr in the plc namespace:

time="2023-07-17T13:34:16+08:00" level=warning msg="[core] grpc: addrConn.createTransport failed to connect to {  <nil> <nil> 0 <nil>}. Err: connection error: desc = \"transport: Error while dialing dial tcp: missing address\"" system=system
time="2023-07-17T13:34:16+08:00" level=info msg="[core] Subchannel Connectivity change to TRANSIENT_FAILURE" system=system
time="2023-07-17T13:34:16+08:00" level=info msg="[balancer] base.baseBalancer: handle SubConn state change: 0xc00086f4e0, CONNECTING" system=system
time="2023-07-17T13:34:16+08:00" level=info msg="[balancer] base.baseBalancer: handle SubConn state change: 0xc00086f4e0, TRANSIENT_FAILURE" system=system
time="2023-07-17T13:36:12+08:00" level=info msg="[core] Subchannel Connectivity change to IDLE" system=system
time="2023-07-17T13:36:12+08:00" level=info msg="[balancer] base.baseBalancer: handle SubConn state change: 0xc00086f4e0, IDLE" system=system
time="2023-07-17T13:36:12+08:00" level=info msg="[core] Subchannel Connectivity change to CONNECTING" system=system
time="2023-07-17T13:36:12+08:00" level=info msg="[core] Subchannel picks a new address \"\" to connect" system=system
time="2023-07-17T13:36:12+08:00" level=warning msg="[core] grpc: addrConn.createTransport failed to connect to {  <nil> <nil> 0 <nil>}. Err: connection error: desc = \"transport: Error while dialing dial tcp: missing address\"" system=system
time="2023-07-17T13:36:12+08:00" level=info msg="[core] Subchannel Connectivity change to TRANSIENT_FAILURE" system=system
time="2023-07-17T13:36:12+08:00" level=info msg="[balancer] base.baseBalancer: handle SubConn state change: 0xc00086f4e0, CONNECTING" system=system
time="2023-07-17T13:36:12+08:00" level=info msg="[balancer] base.baseBalancer: handle SubConn state change: 0xc00086f4e0, TRANSIENT_FAILURE" system=system
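For anyone triaging excerpts like the two above, here is a minimal sketch of one way to pull out only the warning and error entries and count them by message. It assumes the logs are logrus-style `key=value` lines as shown; `parse_logfmt`, `summarize`, and `SAMPLE` are hypothetical names for this illustration and are not part of Pixie.

```python
import shlex
from collections import Counter

# Three lines copied from the excerpts in this thread (assumed logrus key=value format).
SAMPLE = r"""
time="2023-07-17T13:34:01+08:00" level=info msg="Running script" query_id=467ed8b2-09f7-4939-bc52-2a6a47d18050
time="2023-07-17T13:34:01+08:00" level=error msg="failed to execute query" duration=58.020467ms error="rpc error: code = Internal desc = OTel export (carnot node_id=92) failed with error 'UNKNOWN'. Details: Failed to create secure client channel " query_id=467ed8b2-09f7-4939-bc52-2a6a47d18050
time="2023-07-17T13:34:16+08:00" level=warning msg="[core] grpc: addrConn.createTransport failed to connect to {  <nil> <nil> 0 <nil>}. Err: connection error: desc = \"transport: Error while dialing dial tcp: missing address\"" system=system
"""


def parse_logfmt(line):
    """Split one key=value log line into a dict, honoring double-quoted values."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that carry no '='
            fields[key] = value
    return fields


def summarize(lines):
    """Count warning and error entries, keyed by (level, msg)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = parse_logfmt(line)
        level = fields.get("level")
        if level in ("warning", "error"):
            counts[(level, fields.get("msg", ""))] += 1
    return counts


if __name__ == "__main__":
    for (level, msg), n in summarize(SAMPLE.splitlines()).most_common():
        print(f"{n:3d}  {level:7s}  {msg}")
```

To run it against a saved log file, pass `open(path)` to `summarize` in place of `SAMPLE.splitlines()`.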

Yes, I followed the instructions for deploying self-hosted cloud without any modifications.

My k8s env:

CNI: Cilium
Version: 1.24.2

What other information do you need? Please let me know.

zhuw0312 avatar Jul 17 '23 05:07 zhuw0312

Can I use Traefik to expose cloud_ingress_grpc?

zhuw0312 avatar Jul 17 '23 07:07 zhuw0312

After using nginx-ingress-controller, it works.

zhuw0312 avatar Jul 17 '23 08:07 zhuw0312

@aimichelle I have deployed the self-hosted Pixie cloud and tried to access the web UI, but I get "cannot be reached". When I port-forward on 56000 and try to access http://localhost:56000 locally, I get a 404. I followed this documentation for the installation: https://docs.px.dev/installing-pixie/install-guides/production-readiness/ I have updated PASSTHROUGH_PROXY_PORT: "" in the k8s/cloud/public/base/domain_config.yaml file. For PL_DOMAIN_NAME, do I need to change it to my own DNS name or keep it the same (dev.withpixie.dev)?

sabideep1 avatar Jul 27 '23 16:07 sabideep1