sql icon indicating copy to clipboard operation
sql copied to clipboard

[BUG] COUNT with aggregation return multiple rows with null value

Open penghuo opened this issue 1 year ago • 2 comments

What is the bug? curl --request POST
--url http://localhost:9200/_plugins/_sql?format=jdbc
--header 'content-type: application/json'
--data '{"query": "select count(*) from test00001 where status != 200"}'

the result include multiple rows

{
  "schema": [
    {
      "name": "count(*)",
      "type": "integer"
    }
  ],
  "datarows": [
    [
      10
    ],
    [
      null
    ],
    [
      null
    ],
    [
      null
    ],
    [
      null
    ],
    [
      null
    ],
    [
      null
    ],
    [
      null
    ],
    [
      null
    ],
    [
      null
    ]
  ],
  "total": 10,
  "size": 10,
  "

How can one reproduce the bug? Todo

What is the expected behavior? The result only include 1 rows

{
  "schema": [
    {
      "name": "count(*)",
      "type": "integer"
    }
  ],
  "datarows": [
    [
      10
    ]
  ],
  "total": 1,
  "size": 1,
  "

What is your host/environment?

  • OS: 2.11
  • Version [e.g. 22]
  • Plugins: SQL

Do you have any screenshots? If applicable, add screenshots to help explain your problem.

Do you have any additional context? Add any other context about the problem.

penghuo avatar Jun 05 '24 22:06 penghuo

Not able to repro this issue in main branch for field type Integer on JDBC response format

 curl -X POST "http://localhost:9200/_plugins/_sql?format=jdbc" -H "Content-Type: application/json" -d '{
>   "query": "SELECT count(*) FROM test100 where status != 200"}'
{
  "schema": [
    {
      "name": "count(*)",
      "type": "integer"
    }
  ],
  "datarows": [
    [
      1
    ]
  ],
  "total": 1,
  "size": 1,
  "status": 200

But I see if we use a field type text and use the count(*) with != operator the below error -

curl -X POST "http://localhost:9200/_plugins/_sql?format=jdbc" -H "Content-Type: application/json" -d '{
>   "query": "SELECT count(*) FROM test100 where ImageType != \"machine\""
> }'
{
  "error": {
    "reason": "Error occurred in OpenSearch engine: all shards failed",
    "details": "Shard[0]: java.lang.IllegalArgumentException: Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [ImageType] in order to load field data by uninverting the inverted index. Note that this can use significant memory.\n\nFor more details, please send request for Json format to see the raw response from OpenSearch engine.",
    "type": "SearchPhaseExecutionException"
  },
  "status": 400

In JSON format

 curl -X POST "http://localhost:9200/_plugins/_sql?format=json" -H "Content-Type: application/json" -d '{
>   "query": "SELECT count(*) FROM test100 where ImageType != \"machine\""
> }'
{
  "error": {
    "reason": "Invalid SQL query",
    "details": "Unsupported operation: !=",
    "type": "SqlParseException"
  },
  "status": 400
}

So it appears is users are using the count(*) aggregation query with negation operator on text fields, the error is expected as it is not recommended to use the aggregation on text fields. But for Integer fields, issue cannot be reproduced.

manasvinibs avatar Jun 25 '24 00:06 manasvinibs

Not able to repro for field type "keyword" on latest branch

curl -X GET "http://localhost:9200/test200/_mapping"
{"test200":{"mappings":{"properties":{"ImageType":{"type":"keyword"}}}}} 
 curl -X POST "http://localhost:9200/_plugins/_sql?format=jdbc" -H "Content-Type: application/json" -d '{
>   "query": "SELECT count(*) FROM test200 where ImageType != \"machine\""}'
{
  "schema": [
    {
      "name": "count(*)",
      "type": "integer"
    }
  ],
  "datarows": [
    [
      1
    ]
  ],
  "total": 1,
  "size": 1,
  "status": 200
}

manasvinibs avatar Jun 25 '24 16:06 manasvinibs