sql
sql copied to clipboard
[BUG] _id is not available in where condition
What is the bug?
reproduce:
POST _plugins/_sql { "query": "SELECT * FROM my-index1 where _id = 1" }
SemanticCheckException was encountered
similar one. https://github.com/opensearch-project/sql/issues/339.
related issue: https://github.com/opensearch-project/sql/issues/432
Since we're adding more Storage implementation for external data source, such as S3, Prometheus. Probably another way to look at this is: every storage may have some special/meta fields. For example, OpenSearch has _id, _score, S3 may have _partition, _file like these in Trino https://trino.io/docs/current/connector/hive.html#special-columns.
We can include metadata fields in the parser language as part of this ticket and hardcode the field types in the expression analyzer. This will also work to fix the double underscore fields.
@dai-chen unfortunately, this doesn't work for other storage devices - which means all _id fields will be treated equally. We would need to think of a way around this when defining our user-defined-languages and maybe just check the table storage type is part of a hardcoded list (for that storage type) or is defined in the user-defined-language.
Works for these examples:
SELECT calcs.key, str0, _id, _score, _maxscore FROM calcs WHERE _id="5"
>> returns calcs.key, calcs.str0, calcs._id, calcs._score, calcs._maxscore
SELECT *, _id FROM bigint WHERE _id="2"
>> returns all of the fields defined in bigint plus _id
SELECT __myCoolField FROM myindex WHERE __myCoolField="cool";
>>returns myindex.__myCoolField
PR prototype: https://github.com/Bit-Quill/opensearch-project-sql/pull/142
@acarbonetto Any progress on this? I'm thinking can we label all these related issues 2.6.0?
@dai-chen this issue can be closed as part of https://github.com/opensearch-project/sql/pull/1456