[FEATURE] Make SQL align with PPL to support evaluate nested fields and update fields.
Is your feature request related to a problem? Currently in OpenSearch, SQL doesn't support update or evaluate nested fields like PPL. To make them align, we may need to make SQL support these features as well.
e.g.
- Update fields PPL supports updating the fields while SQL produces a new field with the same field_name which may leads to error.
# PPL
POST _plugins/_ppl/_explain
{
"query": """
source=opensearch_dashboards_sample_data_flights | eval FlightTimeMin = 1
"""
}
# response
{
"root": {
"name": "ProjectOperator",
"description": {
"fields": "[FlightNum, Origin, OriginLocation, DestLocation, FlightDelay, DistanceMiles, FlightTimeMin, OriginWeather, dayOfWeek, AvgTicketPrice, Carrier, FlightDelayMin, OriginRegion, DestAirportID, FlightDelayType, timestamp, Dest, FlightTimeHour, Cancelled, DistanceKilometers, OriginCityName, DestWeather, OriginCountry, DestCountry, DestRegion, DestCityName, OriginAirportID]"
},
"children": [
{
"name": "EvalOperator",
"description": {
"expressions": {
"FlightTimeMin": "1"
}
},
"children": [
{
"name": "OpenSearchIndexScan",
"description": {
"request": """OpenSearchQueryRequest(indexName=opensearch_dashboards_sample_data_flights, sourceBuilder={"from":0,"size":10000,"timeout":"1m"}, searchDone=false)"""
},
"children": []
}
]
}
]
}
}
# SQL
POST _plugins/_sql
{
"query": "SELECT *, 1 as FlightTimeMin FROM opensearch_dashboards_sample_data_flights LIMIT 5"
}
# response error
"{\n \"error\": {\n \"reason\": \"Invalid SQL query\",\n \"details\": \"Multiple entries with same key: FlightTimeMin=1 and FlightTimeMin=1030.7704\",\n \"type\": \"IllegalArgumentException\"\n },\n \"status\": 400\n}"
- Evaluate nested fields PPL supports evaluating fields from left to right while SQL cannot resolve the reference.
# PPL
POST _plugins/_ppl/_explain
{
"query": """
source=opensearch_dashboards_sample_data_flights | eval FlightTimeMinPlusOne = FlightTimeMin + 1, FlightTimeMinPlusTwo = FlightTimeMinPlusOne + 1 | fields FlightTimeMinPlusOne, FlightTimeMinPlusTwo
"""
}
# response
{
"root": {
"name": "ProjectOperator",
"description": {
"fields": "[FlightTimeMinPlusOne, FlightTimeMinPlusTwo]"
},
"children": [
{
"name": "EvalOperator",
"description": {
"expressions": {
"FlightTimeMinPlusTwo": "+(FlightTimeMinPlusOne, 1)",
"FlightTimeMinPlusOne": "+(FlightTimeMin, 1)"
}
},
"children": [
{
"name": "OpenSearchIndexScan",
"description": {
"request": """OpenSearchQueryRequest(indexName=opensearch_dashboards_sample_data_flights, sourceBuilder={"from":0,"size":10000,"timeout":"1m"}, searchDone=false)"""
},
"children": []
}
]
}
]
}
}
# SQL
POST _plugins/_sql/_explain
{
"query": "SELECT (FlightTimeMin + 1) as FlightTimeMinPlusOne, (FlightTimeMinPlusOne + 1) as FlightTimeMinPlusTwo FROM opensearch_dashboards_sample_data_flights LIMIT 5"
}
# response error
"{\n \"error\": {\n \"reason\": \"Invalid SQL query\",\n \"details\": \"can't resolve Symbol(namespace=FIELD_NAME, name=FlightTimeMinPlusOne) in type env\",\n \"type\": \"SemanticCheckException\"\n },\n \"status\": 400\n}"
What solution would you like? These 2 features are all supported in EvalOperator. But when compiling SQL, it won't use EvalOperator now but only ProjectOperator, which contributes to the differences above.
To make them align, there could be 2 optional solutions:
- Make SQL generates EvalOperator as well like PPL
- Enhance ProjectOperator to support these 2 features.
What alternatives have you considered? A clear and concise description of any alternative solutions or features you've considered.
Do you have any additional context? Add any other context or screenshots about the feature request here.
[Weekly Catch All Triage - 1]
@qianheng-aws are you working of this feature ?