SegmentMetadata query returns the wrong type for thetaSketch columns when the query interval includes the real-time ingestion range
Environment
- Apache Druid: 26.0.0
- Kafka: 2.7.1
Description
I use Kafka ingestion and submit the ingestion spec as follows.
...
"metricsSpec": [
  {
    "name": "uniq_column1",
    "type": "thetaSketch",
    "fieldName": "uniq_column1",
    "size": 16384
  },
  {
    "name": "uniq_column2",
    "type": "thetaSketch",
    "fieldName": "uniq_column2",
    "size": 16384
  }
]
...
"tuningConfig": {
  "type": "kafka",
  "maxRowsPerSegment": 1000000000,
  "maxTotalRows": 1000000000,
  "maxBytesInMemory": -1
},
...
"granularitySpec": {
  "type": "uniform",
  "segmentGranularity": "HOUR",
  "queryGranularity": "SECOND",
  "rollup": true
}
...
"taskDuration": "PT1H"
When I run a segmentMetadata query, the thetaSketch columns come back with `type` and `typeSignature` set to STRING instead of thetaSketch.
{
  "queryType": "segmentMetadata",
  "dataSource": "datasource",
  "merge": true
}
| column | typeSignature | type | errorMessage |
|---|---|---|---|
| uniq_column1 | STRING | STRING | error:cannot_merge_diff_types: [thetaSketch] and [thetaSketchBuild] |
| uniq_column2 | STRING | STRING | error:cannot_merge_diff_types: [thetaSketch] and [thetaSketchBuild] |
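To make the check above repeatable, here is a minimal Python sketch that builds the same segmentMetadata query and lists every column whose merged analysis carries an `errorMessage`. It assumes the query is posted to the Broker's native query endpoint (`/druid/v2`) and that the response is a list of analysis objects with a `columns` map. The function names and the `broker_url` parameter are hypothetical, chosen only for illustration.

```python
import json
import urllib.request


def build_segment_metadata_query(datasource, intervals=None):
    """Build the segmentMetadata query from the report (merge enabled)."""
    query = {"queryType": "segmentMetadata", "dataSource": datasource, "merge": True}
    if intervals:
        # Optional: restrict the analysis to specific ISO-8601 intervals.
        query["intervals"] = list(intervals)
    return query


def columns_with_merge_errors(results):
    """Return {column name: errorMessage} for columns whose analysis has an error."""
    errors = {}
    for analysis in results:
        for name, column in analysis.get("columns", {}).items():
            message = column.get("errorMessage")
            if message:
                errors[name] = message
    return errors


def run_query(broker_url, query):
    """POST the query to the Broker (hypothetical URL supplied by the caller)."""
    request = urllib.request.Request(
        broker_url.rstrip("/") + "/druid/v2",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `columns_with_merge_errors(run_query(broker_url, build_segment_metadata_query("datasource")))` against an affected cluster should list `uniq_column1` and `uniq_column2` together with their `cannot_merge_diff_types` messages.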
However, when I restrict the query interval so that it excludes the real-time ingestion range, the query returns the correct type.
{
  "queryType": "segmentMetadata",
  "dataSource": "datasource",
  "merge": true,
  "intervals": ["2024-08-30T04:00:00.000Z/2024-09-01T23:00:00.000Z"]
}
| column | typeSignature | type | errorMessage |
|---|---|---|---|
| uniq_column1 | COMPLEX<thetaSketch> | thetaSketch | null |
| uniq_column2 | COMPLEX<thetaSketch> | thetaSketch | null |
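As a stopgap, the interval can be computed so that it always ends before the hours still being ingested. The sketch below is one way to do that in Python, assuming HOUR segment granularity as in the spec above. The helper name and the `skip_hours` parameter are hypothetical, and how many trailing hours must be skipped depends on when the real-time segments are actually handed off.

```python
from datetime import datetime, timedelta, timezone


def historical_interval(now, lookback_hours, skip_hours=1):
    """Return an ISO-8601 interval that ends `skip_hours` before the current hour.

    `now` must be a timezone-aware datetime. `skip_hours` is an assumption
    about how many trailing hours are still served by real-time tasks.
    """
    current_hour = now.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)
    end = current_hour - timedelta(hours=skip_hours)
    start = end - timedelta(hours=lookback_hours)
    fmt = "%Y-%m-%dT%H:%M:%S.000Z"
    return start.strftime(fmt) + "/" + end.strftime(fmt)
```

The returned string can be placed in the `intervals` array of the segmentMetadata query. This only works around the wrong type; it does not fix the merge itself.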
I also run a Druid cluster on version 0.21.0, and the same query there returns the correct type.
{
  "queryType": "segmentMetadata",
  "dataSource": "datasource",
  "merge": true
}
| column | type | errorMessage |
|---|---|---|
| uniq_column1 | thetaSketch | null |
| uniq_column2 | thetaSketch | null |
The merge seems to fail specifically for thetaSketch columns in the real-time ingestion range. A similar issue was already fixed in https://github.com/apache/druid/issues/3339, but the problem still occurs in version 26.0.0.
Is there a solution for this, or has it been fixed in a newer version of the Druid cluster?
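To illustrate what the error message suggests, here is a simplified Python model of folding two per-segment column analyses. It is not Druid's actual code. It only reproduces the observed output under the assumption, inferred from the error text, that published segments report `thetaSketch` while real-time segments report `thetaSketchBuild`, and that a type mismatch yields a STRING placeholder carrying the error.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ColumnAnalysis:
    """Toy stand-in for one column's analysis; the field names are illustrative."""
    type: str
    error_message: Optional[str] = None


def fold(left, right):
    """Merge two analyses of the same column (simplified model, not Druid's code)."""
    # An existing error is carried through unchanged.
    if left.error_message:
        return left
    if right.error_message:
        return right
    # Assumption: differing type names cannot be merged, so the result is
    # a STRING placeholder whose errorMessage names both types.
    if left.type != right.type:
        message = f"error:cannot_merge_diff_types: [{left.type}] and [{right.type}]"
        return ColumnAnalysis("STRING", message)
    return left
```

In this model, `fold(ColumnAnalysis("thetaSketch"), ColumnAnalysis("thetaSketchBuild"))` produces a STRING column with the same error message as in the first table, while folding two `thetaSketch` analyses keeps the type intact.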
@findingrish Is this something you can look into?
I tested with Druid 30.0.0, and the issue still occurs.
This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.
This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.