Logs: "Surrounding Context" results in error

Open mscdex opened this issue 6 months ago • 8 comments

ClickHouse: 25.5.4.38 HyperDX: 2.0.2

The "Surrounding Context" tab for a log entry results in a message:

Error loading results, please check your query or try again.

Value nan cannot be parsed as Int64 for query parameter 'HYPERDX_PARAM_78043' because it isn't parsed completely: only 0 of 3 bytes was parsed: .

mscdex avatar Jun 27 '25 01:06 mscdex

@mscdex can you share if this is using the default OTel schema or a custom schema? Or the generated HTTP request that failed? Just trying to figure out how to repro this bug.

MikeShi42 avatar Jun 29 '25 00:06 MikeShi42

Custom schema with only a few log table fields configured.

The message is displayed on the page itself, not the browser console. I didn't look to see if it was due to a failed HTTP request or not.

mscdex avatar Jun 29 '25 03:06 mscdex

I also experience this issue. I'm using the default OTel schema with JSON support enabled.

hiasr avatar Aug 20 '25 15:08 hiasr

I'm still not able to repro the issue. @hiasr, is this a persistent issue? Do you have reproduction steps by any chance, or can you share which page it occurs on and the stack trace (especially if it repros in play.hyperdx.io)?

MikeShi42 avatar Aug 20 '25 17:08 MikeShi42

@MikeShi42 I hit this same error just now while following the lab included in Observability at Scale with ClickStack. I was following the instructions for Lab 1.1 Getting Started with ClickStack, step 12. This is on an M4 MacBook Pro using the Docker image from the lab.

Here are several of the messages, in case the extra context helps.

2025.08.29 14:02:21.588056 [ 961 ] {a6c8c48d-4762-4ece-bf8b-0551247efcc0} <Error> executeQuery: Code: 457. DB::Exception: Value nan cannot be parsed as Int64 for query parameter 'HYPERDX_PARAM_78043' because it isn't parsed completely: only 0 of 3 bytes was parsed: . (BAD_QUERY_PARAMETER) (version 25.6.8.10 (official build)) (from 127.0.0.1:46554) (in query: SELECT Timestamp AS "Timestamp",ResourceAttributes['rum.sessionId'] AS "rumSessionId",ResourceAttributes['service.name'] AS "serviceName",ParentSpanId AS "parentSpanId" FROM {HYPERDX_PARAM_1544803905:Identifier}.{HYPERDX_PARAM_4567585:Identifier} WHERE (Timestamp >= fromUnixTimestamp64Milli({HYPERDX_PARAM_78043:Int64}) AND Timestamp <= fromUnixTimestamp64Milli({HYPERDX_PARAM_78043:Int64})) AND (TraceId = 'undefined') LIMIT {HYPERDX_PARAM_46730161:Int32} OFFSET {HYPERDX_PARAM_48:Int32} FORMAT JSONCompactEachRowWithNamesAndTypes), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000dadff48
1. DB::Exception::Exception(PreformattedMessage&&, int) @ 0x0000000009121a1c
2. DB::Exception::Exception<String const&, String const&, String const&, unsigned long, unsigned long, String>(int, FormatStringHelperImpl<std::type_identity<String const&>::type, std::type_identity<String const&>::type, std::type_identity<String const&>::type, std::type_identity<unsigned long>::type, std::type_identity<unsigned long>::type, std::type_identity<String>::type>, String const&, String const&, String const&, unsigned long&&, unsigned long&&, String&&) @ 0x00000000118f1b54
3. DB::ReplaceQueryParameterVisitor::visitQueryParameter(std::shared_ptr<DB::IAST>&) @ 0x00000000118f0e0c
4. DB::ReplaceQueryParameterVisitor::visitChildren(std::shared_ptr<DB::IAST>&) @ 0x00000000118f1394
5. DB::ReplaceQueryParameterVisitor::visitChildren(std::shared_ptr<DB::IAST>&) @ 0x00000000118f1394
6. DB::ReplaceQueryParameterVisitor::visitChildren(std::shared_ptr<DB::IAST>&) @ 0x00000000118f1394
7. DB::ReplaceQueryParameterVisitor::visitChildren(std::shared_ptr<DB::IAST>&) @ 0x00000000118f1394
8. DB::ReplaceQueryParameterVisitor::visitChildren(std::shared_ptr<DB::IAST>&) @ 0x00000000118f1394
9. DB::ReplaceQueryParameterVisitor::visitChildren(std::shared_ptr<DB::IAST>&) @ 0x00000000118f1394
10. DB::ReplaceQueryParameterVisitor::visitChildren(std::shared_ptr<DB::IAST>&) @ 0x00000000118f1394
11. DB::ReplaceQueryParameterVisitor::visitChildren(std::shared_ptr<DB::IAST>&) @ 0x00000000118f1394
12. DB::ReplaceQueryParameterVisitor::visitChildren(std::shared_ptr<DB::IAST>&) @ 0x00000000118f1394
13. DB::ReplaceQueryParameterVisitor::visitChildren(std::shared_ptr<DB::IAST>&) @ 0x00000000118f1394
14. DB::ReplaceQueryParameterVisitor::visitChildren(std::shared_ptr<DB::IAST>&) @ 0x00000000118f1394
15. DB::executeQueryImpl(char const*, char const*, std::shared_ptr<DB::Context>, DB::QueryFlags, DB::QueryProcessingStage::Enum, DB::ReadBuffer*, std::shared_ptr<DB::IAST>&) @ 0x0000000011903488
16. DB::executeQuery(DB::ReadBuffer&, DB::WriteBuffer&, bool, std::shared_ptr<DB::Context>, std::function<void (DB::QueryResultDetails const&)>, DB::QueryFlags, std::optional<DB::FormatSettings> const&, std::function<void (DB::IOutputFormat&, String const&, std::shared_ptr<DB::Context const> const&, std::optional<DB::FormatSettings> const&)>, std::function<void ()>) @ 0x000000001190b508
17. DB::HTTPHandler::processQuery(DB::HTTPServerRequest&, DB::HTMLForm&, DB::HTTPServerResponse&, DB::HTTPHandler::Output&, std::optional<DB::CurrentThread::QueryScope>&, StrongTypedef<unsigned long, ProfileEvents::EventTag> const&) @ 0x00000000129f935c
18. DB::HTTPHandler::handleRequest(DB::HTTPServerRequest&, DB::HTTPServerResponse&, StrongTypedef<unsigned long, ProfileEvents::EventTag> const&) @ 0x00000000129fc320
19. DB::HTTPServerConnection::run() @ 0x0000000012aa0c80
20. Poco::Net::TCPServerConnection::start() @ 0x0000000015abaeb8
21. Poco::Net::TCPServerDispatcher::run() @ 0x0000000015abb3d4
22. Poco::PooledThread::run() @ 0x0000000015a854d8
23. Poco::ThreadImpl::runnableEntry(void*) @ 0x0000000015a838b0
24. ? @ 0x000000000007d5b8
25. ? @ 0x00000000000e5edc

2025.08.29 14:02:21.588419 [ 961 ] {a6c8c48d-4762-4ece-bf8b-0551247efcc0} <Error> DynamicQueryHandler: Code: 457. DB::Exception: Value nan cannot be parsed as Int64 for query parameter 'HYPERDX_PARAM_78043' because it isn't parsed completely: only 0 of 3 bytes was parsed: . (BAD_QUERY_PARAMETER), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000dadff48
1. DB::Exception::Exception(PreformattedMessage&&, int) @ 0x0000000009121a1c
2. DB::Exception::Exception<String const&, String const&, String const&, unsigned long, unsigned long, String>(int, FormatStringHelperImpl<std::type_identity<String const&>::type, std::type_identity<String const&>::type, std::type_identity<String const&>::type, std::type_identity<unsigned long>::type, std::type_identity<unsigned long>::type, std::type_identity<String>::type>, String const&, String const&, String const&, unsigned long&&, unsigned long&&, String&&) @ 0x00000000118f1b54
3. DB::ReplaceQueryParameterVisitor::visitQueryParameter(std::shared_ptr<DB::IAST>&) @ 0x00000000118f0e0c
4. DB::ReplaceQueryParameterVisitor::visitChildren(std::shared_ptr<DB::IAST>&) @ 0x00000000118f1394
5. DB::ReplaceQueryParameterVisitor::visitChildren(std::shared_ptr<DB::IAST>&) @ 0x00000000118f1394
6. DB::ReplaceQueryParameterVisitor::visitChildren(std::shared_ptr<DB::IAST>&) @ 0x00000000118f1394
7. DB::ReplaceQueryParameterVisitor::visitChildren(std::shared_ptr<DB::IAST>&) @ 0x00000000118f1394
8. DB::ReplaceQueryParameterVisitor::visitChildren(std::shared_ptr<DB::IAST>&) @ 0x00000000118f1394
9. DB::ReplaceQueryParameterVisitor::visitChildren(std::shared_ptr<DB::IAST>&) @ 0x00000000118f1394
10. DB::ReplaceQueryParameterVisitor::visitChildren(std::shared_ptr<DB::IAST>&) @ 0x00000000118f1394
11. DB::ReplaceQueryParameterVisitor::visitChildren(std::shared_ptr<DB::IAST>&) @ 0x00000000118f1394
12. DB::ReplaceQueryParameterVisitor::visitChildren(std::shared_ptr<DB::IAST>&) @ 0x00000000118f1394
13. DB::ReplaceQueryParameterVisitor::visitChildren(std::shared_ptr<DB::IAST>&) @ 0x00000000118f1394
14. DB::ReplaceQueryParameterVisitor::visitChildren(std::shared_ptr<DB::IAST>&) @ 0x00000000118f1394
15. DB::executeQueryImpl(char const*, char const*, std::shared_ptr<DB::Context>, DB::QueryFlags, DB::QueryProcessingStage::Enum, DB::ReadBuffer*, std::shared_ptr<DB::IAST>&) @ 0x0000000011903488
16. DB::executeQuery(DB::ReadBuffer&, DB::WriteBuffer&, bool, std::shared_ptr<DB::Context>, std::function<void (DB::QueryResultDetails const&)>, DB::QueryFlags, std::optional<DB::FormatSettings> const&, std::function<void (DB::IOutputFormat&, String const&, std::shared_ptr<DB::Context const> const&, std::optional<DB::FormatSettings> const&)>, std::function<void ()>) @ 0x000000001190b508
17. DB::HTTPHandler::processQuery(DB::HTTPServerRequest&, DB::HTMLForm&, DB::HTTPServerResponse&, DB::HTTPHandler::Output&, std::optional<DB::CurrentThread::QueryScope>&, StrongTypedef<unsigned long, ProfileEvents::EventTag> const&) @ 0x00000000129f935c
18. DB::HTTPHandler::handleRequest(DB::HTTPServerRequest&, DB::HTTPServerResponse&, StrongTypedef<unsigned long, ProfileEvents::EventTag> const&) @ 0x00000000129fc320
19. DB::HTTPServerConnection::run() @ 0x0000000012aa0c80
20. Poco::Net::TCPServerConnection::start() @ 0x0000000015abaeb8
21. Poco::Net::TCPServerDispatcher::run() @ 0x0000000015abb3d4
22. Poco::PooledThread::run() @ 0x0000000015a854d8
23. Poco::ThreadImpl::runnableEntry(void*) @ 0x0000000015a838b0
24. ? @ 0x000000000007d5b8
25. ? @ 0x00000000000e5edc
 (version 25.6.8.10 (official build))
2025.08.29 14:02:27.066995 [ 1088 ] {} <Fatal> BaseDaemon: ########## Short fault info ############
2025.08.29 14:02:27.067038 [ 1088 ] {} <Fatal> BaseDaemon: (version 25.6.8.10 (official build), build id: 82272E0517611608517187046264563BEB4EE2CE, git hash: c54ce812cd4fe54c653d8cbf17b7ece6f4482deb, architecture: aarch64) (from thread 908) Received signal 4
2025.08.29 14:02:27.067046 [ 1088 ] {} <Fatal> BaseDaemon: Signal description: Illegal instruction
2025.08.29 14:02:27.067050 [ 1088 ] {} <Fatal> BaseDaemon: Illegal opcode.
2025.08.29 14:02:27.067055 [ 1088 ] {} <Fatal> BaseDaemon: Stack trace: 0x000000000de4cc14 0x0000ffff867ad7a0
2025.08.29 14:02:27.067063 [ 1088 ] {} <Fatal> BaseDaemon: ########################################
2025.08.29 14:02:27.067081 [ 1088 ] {} <Fatal> BaseDaemon: (version 25.6.8.10 (official build), build id: 82272E0517611608517187046264563BEB4EE2CE, git hash: c54ce812cd4fe54c653d8cbf17b7ece6f4482deb) (from thread 908) (query_id: cc1c07ab-0fb4-4c11-bc81-b80e8987a56b) (query: SELECT SpanAttributes['message'] AS body, SpanAttributes['component'] AS component, (toFloat64OrZero(toString(Duration)) * pow(10, 3)) / pow(10, toInt8OrZero(toString(9))) AS durationInMs, SpanAttributes['error.message'] AS `error.message`, SpanAttributes['http.method'] AS `http.method`, SpanAttributes['http.status_code'] AS `http.status_code`, SpanAttributes['http.url'] AS `http.url`, cityHash64(TraceId, ParentSpanId, SpanId, Timestamp) AS id, SpanAttributes['location.href'] AS `location.href`, ScopeName AS `otel.library.name`, ParentSpanId AS parent_span_id, StatusCode AS severity_text, SpanId AS span_id, SpanName AS span_name, Timestamp AS timestamp, TraceId AS trace_id, CAST('span', 'String') AS type FROM default.otel_traces WHERE ((Timestamp >= fromUnixTimestamp64Milli(_CAST(1756461273405, 'Int64'))) AND (Timestamp <= fromUnixTimestamp64Milli(_CAST(1756490179245, 'Int64')))) AND (((ResourceAttributes['rum.sessionId']) = '54f68bbcedc2cf5181a16a9191b30327') AND (((SpanAttributes['http.status_code']) > '299') OR ((SpanAttributes['component']) = 'error') OR (SpanName = 'routeChange') OR (SpanName = 'documentLoad') OR (SpanName = 'intercom.onShow') OR (ScopeName = 'custom-action'))) ORDER BY Timestamp ASC LIMIT _CAST(0, 'Int32'), _CAST(4000, 'Int32') FORMAT JSON) Received signal Illegal instruction (4)
2025.08.29 14:02:27.067087 [ 1088 ] {} <Fatal> BaseDaemon: Illegal opcode.
2025.08.29 14:02:27.067097 [ 1088 ] {} <Fatal> BaseDaemon: Stack trace: 0x000000000de4cc14 0x0000ffff867ad7a0
2025.08.29 14:02:27.067174 [ 1088 ] {} <Fatal> BaseDaemon: 0. signalHandler(int, siginfo_t*, void*) @ 0x000000000de4cc14
2025.08.29 14:02:27.067184 [ 1088 ] {} <Fatal> BaseDaemon: 1. ? @ 0x0000ffff867ad7a0
2025.08.29 14:02:27.067192 [ 1088 ] {} <Fatal> BaseDaemon: Integrity check of the executable skipped because the reference checksum could not be read.
2025.08.29 14:02:27.067336 [ 1088 ] {} <Fatal> BaseDaemon: Report this error to https://github.com/ClickHouse/ClickHouse/issues
2025.08.29 14:02:27.067500 [ 1088 ] {} <Fatal> BaseDaemon: Changed settings: use_uncompressed_cache = false, load_balancing = 'in_order', log_queries = true, max_memory_usage = 10000000000, cancel_http_readonly_queries_on_client_close = true, parallel_replicas_for_cluster_engines = false, date_time_output_format = 'iso'

toddyocum avatar Aug 29 '25 14:08 toddyocum

Reproduction steps(ish)

Client Sessions -> Select [email protected] Session -> Select 500 Post Error -> Select the Surrounding Context tab

It sometimes works at first. I clicked around among the tabs, switched to other steps in the session replay and back, and then it errored.

A docker restart of the container gets things running again, but the error can still be triggered.

I haven't been able to figure out a pattern of interactions for consistent repeatability.

toddyocum avatar Aug 29 '25 14:08 toddyocum

I am seeing a similar error. It appears to happen because the query declares the parameter as {HYPERDX_PARAM_78043:Int64}, but the query parameters sent in the HTTP request set HYPERDX_PARAM_78043=nan, which is not a valid Int64 (though it is a valid float).
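
To illustrate the mismatch, here is a minimal TypeScript sketch of how a NaN could end up in an Int64 parameter and how a guard might reject it before the request is sent. The helper name `toInt64Param` is hypothetical and not part of HyperDX, and the invalid date is only one assumed source of NaN, not a confirmed cause.

```typescript
// Hypothetical helper (not a HyperDX API): serialize a millisecond timestamp
// for an Int64 query parameter, refusing anything that is not an integer.
function toInt64Param(name: string, value: number): string {
  // Number.isInteger is false for NaN, Infinity, and fractional values.
  if (!Number.isInteger(value)) {
    throw new Error(`Parameter ${name} must be an integer, got ${value}`);
  }
  return String(value);
}

// Assumption: an unparseable date is one plausible way to obtain NaN.
const invalidMs: number = new Date("not a date").getTime();

// A valid timestamp (taken from the log above) serializes normally.
const validParam: string = toInt64Param("start", 1756461273405);
console.log(validParam);

try {
  toInt64Param("start", invalidMs);
} catch (err) {
  // The bad value is caught on the client instead of failing in ClickHouse.
  console.log((err as Error).message);
}
```

Failing early like this would surface the missing timestamp in the UI code rather than as a BAD_QUERY_PARAMETER error from the server.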

hiporox avatar Sep 24 '25 21:09 hiporox

The root cause of this issue is that the initial query computes the row id as cityHash64(xxx, xxx).

SELECT
    SpanAttributes['message']                                                             AS "body",
    SpanAttributes['component']                                                           AS "component",
    toFloat64OrZero(toString(Duration)) * pow(10, 3) / pow(10, toInt8OrZero(toString(9))) AS "durationInMs",
    SpanAttributes['error.message']                                                       AS "error.message",
    SpanAttributes['http.method']                                                         AS "http.method",
    SpanAttributes['http.status_code']                                                    AS "http.status_code",
    SpanAttributes['http.url']                                                            AS "http.url",
    cityHash64(TraceId, ParentSpanId, SpanId, Timestamp)                                  AS "id",
    SpanAttributes['location.href']                                                       AS "location.href",
    ScopeName                                                                             AS "otel.library.name",
    ParentSpanId                                                                          AS "parent_span_id",
    StatusCode                                                                            AS "severity_text",
    SpanId                                                                                AS "span_id",
    SpanName                                                                              AS "span_name",
    Timestamp                                                                             AS "timestamp",
    TraceId                                                                               AS "trace_id",
    CAST('span', 'String')                                                                AS "type"
FROM
    default.otel_traces
WHERE (Timestamp >= fromUnixTimestamp64Milli({HYPERDX_PARAM_1270862104:Int64})
  AND Timestamp <= fromUnixTimestamp64Milli({HYPERDX_PARAM_1701414642:Int64}))
-- ......

The page list then uses this id as the rowId, which is passed into a session query statement similar to the following SQL:

SELECT *,
       Timestamp          AS "__hdx_timestamp",
       SpanName           AS "__hdx_body",
       TraceId            AS "__hdx_trace_id",
       SpanId             AS "__hdx_span_id",
       StatusCode         AS "__hdx_severity_text",
       ServiceName        AS "__hdx_service_name",
       ResourceAttributes AS "__hdx_resource_attributes",
       SpanAttributes     AS "__hdx_event_attributes",
       cityHash64(TraceId , ParentSpanId , SpanId , Timestamp)
FROM
    default.otel_traces
WHERE (SpanAttributes['message'] = ''
  AND SpanAttributes['component'] = 'document-load'
  AND cityHash64(TraceId , ParentSpanId , SpanId , Timestamp) = 13326956229374271000 -- wrong, expected 13326956229374271831
  AND StatusCode = 'Unset'
  AND SpanId = '692aac963b64a150'
  AND SpanName = 'documentLoad'
  AND Timestamp = parseDateTime64BestEffort('2025-10-23T08:16:37.358800098Z' , 9)
  AND TraceId = '5a0c6954d0d94d32484047781a9535aa'
  AND CAST('span' , 'String') = 'span')

The system is supposed to use the result to further determine the context.

The problem is that cityHash64 returns a UInt64 value. JSON numbers, as parsed in JavaScript, are IEEE 754 doubles and can only represent integers exactly up to 2^53 - 1, so a full 64-bit value loses precision when it is parsed. As a result, the subsequent query fails to match the correct record.
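
The precision loss can be reproduced outside HyperDX. The TypeScript sketch below parses the 64-bit hash from the example above once as a bare JSON number and once as a quoted string. It only illustrates the effect. Returning the id as a string, for example by wrapping it in toString(...) in the SELECT or by using ClickHouse's output_format_json_quote_64bit_integers setting, is one possible workaround and not necessarily the fix the project will adopt.

```typescript
// The hash value from the example query above, kept as text.
const expected = "13326956229374271831";

// Parsed as a bare JSON number, the value becomes an IEEE 754 double.
const lossy = JSON.parse(`{"id": ${expected}}`) as { id: number };

// Parsed as a quoted JSON string, every digit is preserved.
const exact = JSON.parse(`{"id": "${expected}"}`) as { id: string };

// The double is outside the safe integer range, so it does not round-trip.
const isSafe: boolean = Number.isSafeInteger(lossy.id);
const numberRoundTrips: boolean = String(lossy.id) === expected;
const stringRoundTrips: boolean = exact.id === expected;

console.log({ isSafe, numberRoundTrips, stringRoundTrips });
```

Because the id is only ever compared for equality, treating it as an opaque string on the client loses nothing.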

zzmark avatar Oct 24 '25 06:10 zzmark