
"SpanId" error after Helm Chart upgrade to 0.13.0

Open lancethomas1 opened this issue 7 months ago • 16 comments

Component(s)

controlplane

Component version

See list in description

wgc version

0.71.4

controlplane version

See list in description

router version

See list in description

What happened?

Description

After upgrading our Cosmo stack via Helm Chart, the traces in Studio return an internal error.

Image

This is the log we are seeing for this error:

{"level":"error","time":"2025-04-03T16:18:03.829Z","pid":1,"hostname":"cosmo-controlplane-5c6bbdb479-5hvsm","msg":"Code: 47. DB::Exception: Unknown expression identifier 'SpanId' in scope SELECT TraceId AS traceId, SpanId AS spanId, toString(toUnixTimestamp(Timestamp)) AS unixTimestamp, OperationName AS operationName, OperationType AS operationType, Duration AS durationInNano, StatusCode AS statusCode, StatusMessage AS statusMessage, OperationContent AS operationContent, HttpStatusCode AS httpStatusCode, HttpHost AS httpHost, HttpUserAgent AS httpUserAgent, HttpMethod AS httpMethod, HttpTarget AS httpTarget, OperationPersistedID AS operationPersistedId, OperationHash AS operationHash, ClientName AS clientName, ClientVersion AS clientVersion, IF(empty(OperationPersistedID), false, true) AS isPersisted FROM cosmo.traces WHERE (Timestamp >= toDateTime(_CAST(1743610683, 'UInt64'))) AND (Timestamp <= toDateTime(_CAST(1743697083, 'UInt64'))) AND (FederatedGraphID = '71dc7892-879e-4693-92a5-b4dedf10ee5c') AND (OrganizationID = '08a5f0a0-3b29-4d1c-8ee8-7e26f2ce1533') ORDER BY unixTimestamp DESC LIMIT _CAST(0, 'Int16'), _CAST(20, 'Int16'). Maybe you meant: ['spanId']. (UNKNOWN_IDENTIFIER) (version 24.6.2.17 (official build))\n"}
{"level":"error","time":"2025-04-03T16:18:03.829Z","pid":1,"hostname":"cosmo-controlplane-5c6bbdb479-5hvsm","reqId":"req-mfv","service":"wg.cosmo.platform.v1.PlatformService","method":"GetAnalyticsView","actor":{"userId":"f79b04ea-ecf5-4560-8c73-b81364d68330","organizationId":"08a5f0a0-3b29-4d1c-8ee8-7e26f2ce1533"}}
{"level":"info","time":"2025-04-03T16:18:03.829Z","pid":1,"hostname":"cosmo-controlplane-5c6bbdb479-5hvsm","reqId":"req-mfv","res":{"statusCode":500},"responseTime":33.45413112640381,"msg":"request completed"}

Note the case mismatch in the message: the exception reports Unknown expression identifier 'SpanId', while ClickHouse suggests Maybe you meant: ['spanId'].
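For anyone triaging similar logs, here is a minimal Python sketch (not part of Cosmo) that pulls the unknown identifier and the suggested names out of a ClickHouse message like the one above and checks whether they differ only by case. The function names and regular expressions are my own assumptions, and the sample string is an abbreviated copy of the logged message. Note that the suggested name matches the alias in the SELECT list, so a suggestion alone does not show that such a column exists in the table.

```python
import re

# Assumed patterns, written against the message text quoted in this issue.
UNKNOWN_RE = re.compile(r"Unknown expression identifier '([^']+)'")
SUGGEST_RE = re.compile(r"Maybe you meant: \[([^\]]*)\]")


def parse_unknown_identifier(msg: str):
    """Return (identifier, suggestions), or None if the message does not match."""
    unknown = UNKNOWN_RE.search(msg)
    if unknown is None:
        return None
    suggest = SUGGEST_RE.search(msg)
    # The suggestion list looks like ['spanId']; pull out each quoted name.
    suggestions = re.findall(r"'([^']+)'", suggest.group(1)) if suggest else []
    return unknown.group(1), suggestions


def differs_only_by_case(identifier: str, suggestions: list) -> bool:
    """True if some suggestion equals the identifier except for letter case."""
    return any(
        identifier != s and identifier.lower() == s.lower() for s in suggestions
    )


# Abbreviated copy of the logged message (the full SELECT list is elided here).
sample = (
    "Code: 47. DB::Exception: Unknown expression identifier 'SpanId' in scope "
    "SELECT TraceId AS traceId, SpanId AS spanId FROM cosmo.traces. "
    "Maybe you meant: ['spanId']. (UNKNOWN_IDENTIFIER)"
)

parsed = parse_unknown_identifier(sample)
if parsed is not None:
    identifier, suggestions = parsed
    print(identifier, suggestions, differs_only_by_case(identifier, suggestions))
```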

curl for the endpoint that's failing:

curl --location 'https://controlplane.graphql.mesh.dkp2.tst.aws-digital.rccl.com/wg.cosmo.platform.v1.PlatformService/GetAnalyticsView?connect=v1&encoding=json&message=%7B%22federatedGraphName%22%3A%22rcg-federation%22%2C%22config%22%3A%7B%22range%22%3A4%2C%22pagination%22%3A%7B%22limit%22%3A20%7D%2C%22sort%22%3A%7B%22id%22%3A%22unixTimestamp%22%2C%22desc%22%3Atrue%7D%7D%2C%22namespace%22%3A%22default%22%7D' \
  --header 'accept: */*' \
  --header 'accept-language: en-US,en;q=0.9' \
  --header 'cache-control: no-cache' \
  --header 'cosmo-org-slug: rccl' \
  --header 'origin: removed' \
  --header 'priority: u=1, i' \
  --header 'referer: removed' \
  --header 'sec-ch-ua: "Chromium";v="134", "Not:A-Brand";v="24", "Google Chrome";v="134"' \
  --header 'sec-ch-ua-mobile: ?0' \
  --header 'sec-ch-ua-platform: "macOS"' \
  --header 'sec-fetch-dest: empty' \
  --header 'sec-fetch-mode: cors' \
  --header 'sec-fetch-site: same-site' \
  --header 'user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36' \
  --header 'Cookie: removed'

response:

Image
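The `message` query parameter in the failing request is URL-encoded JSON, which makes it hard to read in the curl command. A small sketch using only the Python standard library can decode it. The helper name `decode_message` is hypothetical, and the URL is copied from the curl command above.

```python
import json
from urllib.parse import parse_qs, urlsplit

# URL copied from the curl command in this issue.
URL = (
    "https://controlplane.graphql.mesh.dkp2.tst.aws-digital.rccl.com"
    "/wg.cosmo.platform.v1.PlatformService/GetAnalyticsView"
    "?connect=v1&encoding=json&message="
    "%7B%22federatedGraphName%22%3A%22rcg-federation%22%2C%22config%22%3A"
    "%7B%22range%22%3A4%2C%22pagination%22%3A%7B%22limit%22%3A20%7D%2C"
    "%22sort%22%3A%7B%22id%22%3A%22unixTimestamp%22%2C%22desc%22%3Atrue%7D%7D"
    "%2C%22namespace%22%3A%22default%22%7D"
)


def decode_message(url: str) -> dict:
    """Extract and parse the JSON carried in the 'message' query parameter."""
    query = parse_qs(urlsplit(url).query)  # parse_qs percent-decodes the values
    return json.loads(query["message"][0])


payload = decode_message(URL)
print(json.dumps(payload, indent=2))
```

The decoded sort field (`unixTimestamp`, descending) and page limit (20) appear to line up with the `ORDER BY unixTimestamp DESC LIMIT ... 20` in the failing query, which suggests this is the request that triggers it.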

Steps to Reproduce

Upgrade with change sets v0.13.0

Log in to Cosmo Studio

Click on Analytics

Click on Traces

Expected Result

see traces

Actual Result

blank screen with internal error

Component Versions

Cosmo Version | 0.13.0
router | 0.189.0
control plane | 0.125.0
cdn | 0.14.1
studio | 0.102.0
otel collector | 0.18.1
clickhouse | 6.2.14
keycloak | 22.0.0
minio build-version | 14.6.25
postgresql | 12.12.10
redis | 19.3.3
graphqlmetrics | 0.33.0

Environment information

Environment

OS: (e.g., "Ubuntu 20.04") Package Manager: pnpm, npm, yarn, etc Compiler(if manually compiled): (e.g., "go 14.2")

Router configuration


Router execution config


Log output

{"level":"error","time":"2025-04-03T16:18:03.829Z","pid":1,"hostname":"cosmo-controlplane-5c6bbdb479-5hvsm","msg":"Code: 47. DB::Exception: Unknown expression identifier 'SpanId' in scope SELECT TraceId AS traceId, SpanId AS spanId, toString(toUnixTimestamp(Timestamp)) AS unixTimestamp, OperationName AS operationName, OperationType AS operationType, Duration AS durationInNano, StatusCode AS statusCode, StatusMessage AS statusMessage, OperationContent AS operationContent, HttpStatusCode AS httpStatusCode, HttpHost AS httpHost, HttpUserAgent AS httpUserAgent, HttpMethod AS httpMethod, HttpTarget AS httpTarget, OperationPersistedID AS operationPersistedId, OperationHash AS operationHash, ClientName AS clientName, ClientVersion AS clientVersion, IF(empty(OperationPersistedID), false, true) AS isPersisted FROM cosmo.traces WHERE (Timestamp >= toDateTime(_CAST(1743610683, 'UInt64'))) AND (Timestamp <= toDateTime(_CAST(1743697083, 'UInt64'))) AND (FederatedGraphID = '71dc7892-879e-4693-92a5-b4dedf10ee5c') AND (OrganizationID = '08a5f0a0-3b29-4d1c-8ee8-7e26f2ce1533') ORDER BY unixTimestamp DESC LIMIT _CAST(0, 'Int16'), _CAST(20, 'Int16'). Maybe you meant: ['spanId']. (UNKNOWN_IDENTIFIER) (version 24.6.2.17 (official build))\n"}
{"level":"error","time":"2025-04-03T16:18:03.829Z","pid":1,"hostname":"cosmo-controlplane-5c6bbdb479-5hvsm","reqId":"req-mfv","service":"wg.cosmo.platform.v1.PlatformService","method":"GetAnalyticsView","actor":{"userId":"f79b04ea-ecf5-4560-8c73-b81364d68330","organizationId":"08a5f0a0-3b29-4d1c-8ee8-7e26f2ce1533"}}
{"level":"info","time":"2025-04-03T16:18:03.829Z","pid":1,"hostname":"cosmo-controlplane-5c6bbdb479-5hvsm","reqId":"req-mfv","res":{"statusCode":500},"responseTime":33.45413112640381,"msg":"request completed"}
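The control plane logs are one JSON object per line, and the request-scoped entries share a `reqId`. As a rough sketch for finding which RPC produced a 500, the lines can be grouped by `reqId` in Python. The function name `failed_requests` is hypothetical, the field names are taken from the log lines in this issue, and the two sample lines are copied from the output above.

```python
import json

# Two complete lines copied from the log output in this issue.
LOG_LINES = [
    '{"level":"error","time":"2025-04-03T16:18:03.829Z","pid":1,"hostname":"cosmo-controlplane-5c6bbdb479-5hvsm","reqId":"req-mfv","service":"wg.cosmo.platform.v1.PlatformService","method":"GetAnalyticsView","actor":{"userId":"f79b04ea-ecf5-4560-8c73-b81364d68330","organizationId":"08a5f0a0-3b29-4d1c-8ee8-7e26f2ce1533"}}',
    '{"level":"info","time":"2025-04-03T16:18:03.829Z","pid":1,"hostname":"cosmo-controlplane-5c6bbdb479-5hvsm","reqId":"req-mfv","res":{"statusCode":500},"responseTime":33.45413112640381,"msg":"request completed"}',
]


def failed_requests(lines):
    """Map reqId -> {"rpc", "status"} for requests that ended with HTTP 5xx."""
    by_req = {}
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(entry, dict) or "reqId" not in entry:
            continue  # entries without a reqId cannot be correlated
        info = by_req.setdefault(entry["reqId"], {})
        if "service" in entry and "method" in entry:
            info["rpc"] = f'{entry["service"]}/{entry["method"]}'
        status = entry.get("res", {}).get("statusCode")
        if status is not None:
            info["status"] = status
    return {rid: info for rid, info in by_req.items() if info.get("status", 0) >= 500}


print(failed_requests(LOG_LINES))
```

The ClickHouse error line itself carries no `reqId`, so this sketch ignores it. Matching it to a request would need another key, such as the timestamp.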

Additional context

No response

lancethomas1 avatar Apr 03 '25 19:04 lancethomas1

WunderGraph commits fully to Open Source and we want to make sure that we can help you as fast as possible. The roadmap is driven by our customers and we have to prioritize issues that are important to them. You can influence the priority by becoming a customer. Please contact us here.

github-actions[bot] avatar Apr 03 '25 19:04 github-actions[bot]

Hi @lancethomas1, thanks for your contribution. We'll look into it and come back.

StarpTech avatar Apr 08 '25 09:04 StarpTech

Hi @StarpTech wanted to check in on this. Any updates?

lancethomas1 avatar Apr 11 '25 12:04 lancethomas1

Hello @StarpTech Just following up—any updates on this?

pranjan-rccl avatar Apr 24 '25 17:04 pranjan-rccl

@StarpTech were you able to identify any issues here? We are upgrading with helm chart 0.13.1 and we are getting this same error regarding spanId.

lancethomas1 avatar May 28 '25 16:05 lancethomas1

@StarpTech, as mentioned by @lancethomas1, we're encountering the same error in Splunk.

Here are the current component versions:

COSMO_HELM_CHART_VERSION: 0.13.1
router: 0.197.1
control plane: 0.133.1
cdn: 0.14.1
studio: 0.111.0
otel collector: 0.18.1
clickhouse: 24.6.2-debian-12-r3
keycloak: 25.0.2-debian-12-r0
minio: 2024.7.16-debian-12-r0
postgresql: 15.4.0-debian-11-r56
redis: redis:7.2.4-debian-12-r16
graphqlmetrics: 0.33.0

Please let us know if any additional information is needed.

pranjan-rccl avatar May 30 '25 08:05 pranjan-rccl

Hi, when you say "upgrade," from which version did you upgrade?

StarpTech avatar May 30 '25 15:05 StarpTech

We are moving from version 0.12.2 to 0.13.1

fsartori68 avatar May 30 '25 18:05 fsartori68

@StarpTech

Issue: Analytics Traces Blank After Upgrade to v0.13.1

After upgrading our Cosmo stack via Helm Chart from 0.12.2 to 0.13.1, the traces in Studio return a 500 Internal Server Error.

Image

Error Log (from control plane):

{"level":"info","time":"2025-05-28T15:43:21.712Z","pid":1,"hostname":"cosmo-controlplane-67cbdc4dc-5p5lq","reqId":"req-1kt","req":{"method":"GET","url":"/wg.cosmo.platform.v1.PlatformService/GetAnalyticsView?connect=v1&encoding=json&message=%7B%22federatedGraphName%22%3A%22rcg-federation%22%2C%22config%22%3A%7B%22range%22%3A24%2C%22pagination%22%3A%7B%22limit%22%3A20%7D%2C%22sort%22%3A%7B%22id%22%3A%22unixTimestamp%22%2C%22desc%22%3Atrue%7D%7D%2C%22namespace%22%3A%22default%22%7D","hostname":"controlplane.graphql.mesh.dkp2.tst.aws-digital.rccl.com","remoteAddress":"192.168.88.166","remotePort":59194},"msg":"incoming request"}
{"level":"info","time":"2025-05-28T15:43:21.713Z","pid":1,"hostname":"cosmo-controlplane-67cbdc4dc-5p5lq","reqId":"req-1ku","req":{"method":"POST","url":"/wg.cosmo.platform.v1.PlatformService/GetFederatedGraphSDLByName","hostname":"controlplane.graphql.mesh.dkp2.tst.aws-digital.rccl.com","remoteAddress":"192.168.88.166","remotePort":59210},"msg":"incoming request"}
{"level":"error","time":"2025-05-28T15:43:21.740Z","pid":1,"hostname":"cosmo-controlplane-67cbdc4dc-5p5lq","msg":"Code: 47. DB::Exception: Unknown expression identifier 'SpanId' in scope SELECT TraceId AS traceId, SpanId AS spanId, toString(toUnixTimestamp(Timestamp)) AS unixTimestamp, OperationName AS operationName, OperationType AS operationType, Duration AS durationInNano, StatusCode AS statusCode, StatusMessage AS statusMessage, OperationContent AS operationContent, HttpStatusCode AS httpStatusCode, HttpHost AS httpHost, HttpUserAgent AS httpUserAgent, HttpMethod AS httpMethod, HttpTarget AS httpTarget, OperationPersistedID AS operationPersistedId, OperationHash AS operationHash, ClientName AS clientName, ClientVersion AS clientVersion, IF(empty(OperationPersistedID), false, true) AS isPersisted FROM cosmo.traces WHERE (Timestamp >= toDateTime(_CAST(1748360601, 'UInt64'))) AND (Timestamp <= toDateTime(_CAST(1748447001, 'UInt64'))) AND (FederatedGraphID = '71dc7892-879e-4693-92a5-b4dedf10ee5c') AND (OrganizationID = '08a5f0a0-3b29-4d1c-8ee8-7e26f2ce1533') ORDER BY unixTimestamp DESC LIMIT _CAST(0, 'Int64'), _CAST(20, 'Int16'). Maybe you meant: ['spanId']. (UNKNOWN_IDENTIFIER) (version 24.6.2.17 (official build))\n"}
{"level":"error","time":"2025-05-28T15:43:21.740Z","pid":1,"hostname":"cosmo-controlplane-67cbdc4dc-5p5lq","reqId":"req-1kt","service":"wg.cosmo.platform.v1.PlatformService","method":"GetAnalyticsView","actor":{"userId":"6ab8679d-8677-4472-bfd8-3b7dd8977510","organizationId":"08a5f0a0-3b29-4d1c-8ee8-7e26f2ce1533"}}
{"level":"info","time":"2025-05-28T15:43:21.741Z","pid":1,"hostname":"cosmo-controlplane-67cbdc4dc-5p5lq","reqId":"req-1kt","res":{"statusCode":500},"responseTime":28.99932622909546,"msg":"request completed"}
{"level":"info","time":"2025-05-28T15:43:21.777Z","pid":1,"hostname":"cosmo-controlplane-67cbdc4dc-5p5lq","reqId":"req-1ku","res":{"statusCode":200},"responseTime":63.87835216522217,"msg":"request completed"}
{"level":"info","time":"2025-05-28T15:43:23.675Z","pid":1,"hostname":"cosmo-controlplane-67cbdc4dc-5p5lq","reqId":"req-1kv","req":{"method":"GET","url":"/health","hostname":"192.168.243.99:3001","remoteAddress":"10.31.169.57","remotePort":53362},"msg":"incoming request"}
{"level":"info","time":"2025-05-28T15:43:23.676Z","pid":1,"hostname":"cos

Root Cause of the 500 Error: The control plane log shows this error when calling GetAnalyticsView:

DB::Exception: Unknown expression identifier 'SpanId' in scope ... Maybe you meant: ['spanId'].

Steps to Reproduce:

Upgrade to Cosmo Helm Chart version v0.13.1 (same issue also observed with v0.13.0)

Log in to Cosmo Studio

Navigate to Analytics → Traces

Expected Behavior: Traces should load and be visible

Actual Behavior: Blank screen is shown with an internal error (500)

Component Versions:

COSMO_HELM_CHART_VERSION: 0.13.1
router: 0.197.1
control plane: 0.133.1
studio: 0.111.0
otel collector: 0.18.1
clickhouse: 24.6.2-debian-12-r3
graphqlmetrics: 0.33.0
minio: 2024.7.16-debian-12-r0
postgresql: 15.4.0-debian-11-r56
redis:7.2.4-debian-12-r16

pranjan-rccl avatar Jun 05 '25 08:06 pranjan-rccl

Hi @pranjan-rccl, thank you for the additional details. We will take a look.

StarpTech avatar Jun 05 '25 09:06 StarpTech

Hello @StarpTech Any updates please?

pranjan-rccl avatar Jun 10 '25 10:06 pranjan-rccl

@StarpTech I've been investigating an issue where the Control Plane throws the following error:

DB::Exception: Unknown expression identifier 'SpanId' in scope ...

To dig deeper, I ran this query against the ClickHouse system.columns metadata:

SELECT database, table, name AS column_name, type FROM system.columns WHERE name ILIKE 'SpanId';

As you can see from the attached screenshot, SpanId exists only in the cosmo.otel_traces table, not in the cosmo.traces table, which the Control Plane seems to be querying.

📌 So my suspicion (not 100% sure) is that either:

The Control Plane is using an outdated or misconfigured analytics view that assumes SpanId is present in cosmo.traces, or

There’s a mismatch between schema expectations in Studio vs. actual ClickHouse tables.

Would appreciate any insights from those familiar with the control plane logic — or if I’ve missed something obvious.

Image
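The check described in this comment can also be run offline against an export of `system.columns`. The sketch below is illustrative only: the helper names are hypothetical, and `SAMPLE_ROWS` is made-up data for demonstration, not the actual schema. It reflects only the one fact reported above, that `SpanId` shows up under `cosmo.otel_traces`. Lowercasing both sides mirrors what `ILIKE` does for a pattern without wildcards.

```python
# Made-up rows in (database, table, column) form; NOT the real Cosmo schema.
SAMPLE_ROWS = [
    ("cosmo", "otel_traces", "TraceId"),
    ("cosmo", "otel_traces", "SpanId"),
    ("cosmo", "traces", "TraceId"),
]


def tables_with_column(rows, column):
    """Case-insensitive lookup of every table that has the given column."""
    wanted = column.lower()
    return sorted(f"{db}.{tbl}" for db, tbl, name in rows if name.lower() == wanted)


def missing_columns(rows, database, table, expected):
    """Columns a query expects (exact case) that the table does not have."""
    present = {name for db, tbl, name in rows if db == database and tbl == table}
    return [col for col in expected if col not in present]


print(tables_with_column(SAMPLE_ROWS, "spanid"))
print(missing_columns(SAMPLE_ROWS, "cosmo", "traces", ["TraceId", "SpanId"]))
```

Running `missing_columns` with the full list of columns from the failing SELECT against a real export would show whether `SpanId` is the only mismatch or one of several.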

pranjan-rccl avatar Jun 17 '25 12:06 pranjan-rccl

Hello @StarpTech Any updates please?

pranjan-rccl avatar Jun 30 '25 10:06 pranjan-rccl

Hello everyone, do we have any update on this? @StarpTech

fekaiser avatar Aug 28 '25 08:08 fekaiser

@fekaiser ,

Could you give an update on this? Is this resolved?

Slickstef11 avatar Oct 21 '25 19:10 Slickstef11

@Slickstef11, I can confirm that the error is fixed with this stack. The Traces tab is working and there are no errors in the logs.

Helm chart | 0.15.0
router | 0.259.1
control plane | 0.140.0
cdn | 0.14.5
studio | 0.129.0
otel collector | 0.20.0
graphqlmetrics | 0.36.0

Thank you!!

fekaiser avatar Oct 28 '25 11:10 fekaiser