[BUG] Deprecation of OpenSearch DSL format has affected the behavior of the explain API
What is the bug? Deprecation of OpenSearch DSL format has affected the behavior of the explain API https://github.com/opensearch-project/sql/issues/3280
Prior to OpenSearch 3.0 we could request DSL syntax using the following query:
POST /_plugins/_sql/_explain?format=json
{"query":"SELECT Shape, _id FROM railroads LIMIT 0,10000"}
result:
{
"from": 0,
"size": 10000,
"_source": {
"includes": [
"Shape",
"_id"
],
"excludes": []
}
}
With 3.0 we have noticed a couple of issues with our workflow.
- using
format=jsonreturns an error when used with explain API
{
"error": {
"reason": "Invalid SQL query",
"details": "Failed to create executor due to unknown response format: json",
"type": "IllegalArgumentException"
},
"status": 400
}
- if we dont use
format=jsonthen we run into issues with including a geoshape column
POST /_plugins/_sql/_explain
{"query":"SELECT Shape, _id FROM railroads LIMIT 0,10000"}
{
"error": {
"reason": "Invalid SQL query",
"details": "can't resolve Symbol(namespace=FIELD_NAME, name=Shape) in type env",
"type": "SemanticCheckException"
},
"status": 400
}
Hi @chet5619 , thanks for filing this issue.
Regarding to the format=json issue in _explain API
using format=json returns an error when used with explain API
{
"error": {
"reason": "Invalid SQL query",
"details": "Failed to create executor due to unknown response format: json",
"type": "IllegalArgumentException"
},
"status": 400
}
The behavior you're seeing is expected in OpenSearch 3.0. The format=json parameter was specifically for the query execution endpoint (/_plugins/_sql), not for the explain API. The explain API has always returned the DSL query plan without requiring a format parameter. Could you help us understand your use case better?
Regarding to the GeoShape field causing SemanticCheckException
This appears to be a separate bug. The error suggests that the SQL engine cannot resolve the Shape field (which is a geo_shape type). This is likely because:
- GeoShape fields may not be fully supported in SQL SELECT statements
- There may be a regression in field type resolution
Can you provide:
- The mapping of your
railroadsindex (specifically theShapefield definition)? (You can sanitize off some of the sensitive fields) - The OpenSearch version you upgraded from? Or like was this issue exist in 2.x?
- Does the query work without the Shape field (e.g.,
SELECT _id FROM railroads LIMIT 10000)?
Regarding to the Explain API Deprecation
And yes in general, we do have a plan for do the enhancement of the explain API reference to issue https://github.com/opensearch-project/sql/issues/4351. However, that issue is about enhancing the API, not deprecating the DSL format. The DSL format is still the primary output of the explain API.
The behavior you're seeing is expected in OpenSearch 3.0. The format=json parameter was specifically for the query execution endpoint (/_plugins/_sql), not for the explain API. The explain API has always returned the DSL query plan without requiring a format parameter. Could you help us understand your use case better?
as i explained in the original comment the result of the following query:
POST /_plugins/_sql/_explain?format=json
{"query":"SELECT * FROM railroads LIMIT 0,10000"}
...was this - only the transalted DSL query:
{
"from": 0,
"size": 10000,
"_source": {
"includes": [ <array of fields>],
"excludes": []
}
}
but now since we cannot use the format=json we get a result like this:
{
"root": {
"name": "ProjectOperator",
"description": {
"fields": "[<array of fields>]"
},
"children": [
{
"name": "OpenSearchIndexScan",
"description": {
"request": "OpenSearchQueryRequest(indexName=railroads, sourceBuilder={\"from\":0,\"size\":10000,\"timeout\":\"1m\",\"_source\":{\"includes\":[\"array\",\"of\",\"fields\"],\"excludes\":[]}}, needClean=true, searchDone=false, pitId=null, cursorKeepAlive=null, searchAfter=null, searchResponse=null)"
},
"children": []
}
]
}
}
It still includes the json DSL, but we now have to parse the information from the result.
Can you provide:
- The mapping of your
railroadsindex (specifically theShapefield definition)? (You can sanitize off some of the sensitive fields)
"mappings": {
"properties": {
"OBJECTID": {
"type": "long"
},
"Shape": {
"type": "geo_shape"
},
"Shape_Length": {
"type": "double"
},
"name": {
"type": "keyword"
}
}
}
- The OpenSearch version you upgraded from? Or like was this issue exist in 2.x?
The issue doesnt exist for OpenSearch 2.9.x
- Does the query work without the Shape field (e.g.,
SELECT _id FROM railroads LIMIT 10000)?
Yes the query works if the geo_shape column is not included... and the query works if usig a geo_point mapping instead of geo_shape.
Recently tested OpenSearch 3.4 and we are still unable to use any 3.x release for our application. Issues that we are experiencing:
- JSON format is no longer supported for /_plugins/_sql/_explain
this query:
returns an errorPOST /_plugins/_sql/_explain?format=json {"query":"SELECT * FROM railroads LIMIT 0,10000"}
used to return before 3.x:{ "error": { "reason": "Invalid SQL query", "details": "Failed to create executor due to unknown response format: json", "type": "IllegalArgumentException" }, "status": 400 }
without{ "from": 0, "size": 10000 }format=jsonwe get a result that must be parsed before we can use it{ "root": { "name": "ProjectOperator", "description": { "fields": "[OBJECTID, name, Shape_Length]" }, "children": [ { "name": "OpenSearchIndexScan", "description": { "request": "OpenSearchQueryRequest(indexName=railroads, sourceBuilder={\"from\":0,\"size\":10000,\"timeout\":\"1m\",\"_source\":{\"includes\":[\"Shape_Length\",\"OBJECTID\",\"name\"],\"excludes\":[]}}, searchDone=false)" }, "children": [] } ] } } - geopoint fields are supported with
/_plugins/_sql/_explainbut geoshape columns are not. our application uses/_plugins/_sql/_explain?format=jsonto retrieve the DSL query and it would work for both geoshape and geojson, but now sinceformat=jsonis gone, the_explainfails on geoshape fields
Until these issues can be resolved our application will be unable to support opensearch instances 3.x or higher