hyperdx icon indicating copy to clipboard operation
hyperdx copied to clipboard

UI generates invalid ClickHouse queries when filtering on parsed JSON strings

Open garysassano opened this issue 6 months ago • 2 comments

HyperDX UI automatically parses JSON strings in the StatusMessage column but fails to generate proper ClickHouse queries when users attempt to filter on nested JSON fields through the Search for this value only feature.

Steps to Reproduce

  1. Have a span with a StatusMessage field (stored as a String in ClickHouse) containing JSON data.

  2. In the HyperDX UI, the string is automatically parsed as JSON and displayed in a tree structure:

    Image
  3. Navigate to the nested field (e.g., failures.0.index with value "identities")

  4. Click Search for this value only:

    Image

Expected Behavior

The UI should generate a valid ClickHouse query using proper JSON Functions such as:

  • simpleJSONExtractString(StatusMessage, 'index') = 'identities' for field matching

    Image

Actual Behavior

HyperDX generates an invalid query that fails with a ClickHouse syntax error. The UI knows how to parse and display the JSON structure correctly but cannot translate user filtering actions into proper ClickHouse queries.

Image

Suggested Fix

The UI should:

  1. Detect when a field is a JSON string that has been automatically parsed
  2. Prefer simpleJSONExtractString for basic field matching as it's more reliable for nested JSON
  3. Fall back to JSONExtractString only when precise path specification is required

Impact

This significantly impacts usability as users cannot effectively filter on JSON data through the UI, forcing them to write raw ClickHouse queries manually.

garysassano avatar Jul 12 '25 17:07 garysassano

Thanks for the report! I've logged an issue with the team. Reference: HDX-2017

teeohhem avatar Jul 15 '25 14:07 teeohhem

Seems like this was fixed by https://github.com/hyperdxio/hyperdx/pull/1213, but it still is broken for nested JSON inside JSON columns.

To reproduce

  1. Enable self ingestion on the hyperdx instance and search for StatusCode = 'Error'
  2. Find a JSONColumn that has nested JSON (as string). I picked SpanAttributes that has db.statement containing JSON in it.
  3. Notice that the query sent contains JSONExtractString(db.statement, 'create') and completely omits the actual column name.

Video:

https://github.com/user-attachments/assets/7bfbb1fe-b9d1-4eb1-889e-db7d55d24527

To fix

The below patch seems to fix the issue for all such JSON columns. The query is now correctly sent as JSONExtractString(SpanAttributes['db.statement'], 'create')

diff --git a/packages/app/src/components/DBRowJsonViewer.tsx b/packages/app/src/components/DBRowJsonViewer.tsx
index ed71b9a..53b280d 100644
--- a/packages/app/src/components/DBRowJsonViewer.tsx
+++ b/packages/app/src/components/DBRowJsonViewer.tsx
@@ -29,7 +29,10 @@ function buildJSONExtractStringQuery(
     return null; // No nested path to extract
   }
 
-  const baseColumn = parsedJsonRootPath[parsedJsonRootPath.length - 1];
+  const baseColumn = parsedJsonRootPath
+    .map((p, index) => (!index ? p : `['${p}']`))
+    .join('');
+
   const jsonPathArgs = nestedPath.map(p => `'${p}'`).join(', ');
   return `JSONExtractString(${baseColumn}, ${jsonPathArgs})`;
 }

Video:

https://github.com/user-attachments/assets/17c81381-00b0-4fef-b176-c20414a21c4b

fleon avatar Oct 20 '25 22:10 fleon