parseable
parseable copied to clipboard
It is not possible to query fields with capitals names
Query is case sensitive, so we need to avoid modifying query string while processing the query.
This is still not fixed.
Is this issue just on the frontend?
How can I replicate this bug?
Not just frontend. Follow these steps to repro
- Setup Parseable locally.
- Create a stream
- Insert a JSON event with capital field and one small field. Eg
{
"name": "Example event",
"DESCRIPTION": "Optional description",
"startDate": 1413384452,
"endDate": 1413394452
}
- Try querying each field separately via the query API.
@nitisht I was looking at this bug and found the following related issue https://github.com/apache/arrow-datafusion/issues/2374
To try and resolved this problem I see two possible fixes:
- At the ingestion process we force to lowercase all the keys in the json, this normalizes the query files but adds a bit more of processing over head.
- At the API level, if we see any fields with uppercases we will need to add double quotes on the specified fields.
Which one do you think would be the more optimal?
Thank you for digging deeper in this @syepes . IMO the second approach is better - it is much less computationally intensive and will help accomodate different types of columns (e.g. containing unicode).
@syepes From alamb's comment you can see how postgres interprets mixed case. From testing same with parseable it seems like you can query the data but it requires the field to be double quoted.
Technically this is not a bug, so we can just document this behavior properly. Trying to pursue this as a bug means breaking ingestion for fields with same name but differently cased ( which i know no one would actually do but still ).
For query if double quoting is not a common knowledge then we can tweak query statement using sql parser to always include quotes for non-lowercase column names
Closing this, as this is intended behaviour from a SQL perspective. We'll add details on Parseable's case sensitivity in the concepts doc here : https://www.parseable.io/docs/concepts
For query if double quoting is not a common knowledge then we can tweak query statement using sql parser to always include quotes for non-lowercase column names
I had a rapid look and it did not look like and easy task, as we would have to take care of the SELETC and WHERE fields. Something like: Convert to AST --> Double quote the .chars().any(|x| (x.is_uppercase() || !x.is_ascii()) --> Convert back to SQL String
But yes, this is definitively the expected behavior it just need to be documented :-)