parseable icon indicating copy to clipboard operation
parseable copied to clipboard

It is not possible to query fields with capitals names

Open nitisht opened this issue 3 years ago • 1 comments

Query is case sensitive, so we need to avoid modifying query string while processing the query.

nitisht avatar Aug 19 '22 04:08 nitisht

This is still not fixed.

nitisht avatar Aug 24 '22 16:08 nitisht

Is this issue just on the frontend?

How can I replicate this bug?

123vivekr avatar Dec 11 '22 04:12 123vivekr

Not just frontend. Follow these steps to repro

  1. Setup Parseable locally.
  2. Create a stream
  3. Insert a JSON event with capital field and one small field. Eg
{
  "name": "Example event",
  "DESCRIPTION": "Optional description",
  "startDate": 1413384452,
  "endDate": 1413394452
}
  1. Try querying each field separately via the query API.

nitisht avatar Dec 11 '22 04:12 nitisht

@nitisht I was looking at this bug and found the following related issue https://github.com/apache/arrow-datafusion/issues/2374

To try and resolved this problem I see two possible fixes:

  1. At the ingestion process we force to lowercase all the keys in the json, this normalizes the query files but adds a bit more of processing over head.
  2. At the API level, if we see any fields with uppercases we will need to add double quotes on the specified fields.

Which one do you think would be the more optimal?

syepes avatar Feb 14 '23 07:02 syepes

Thank you for digging deeper in this @syepes . IMO the second approach is better - it is much less computationally intensive and will help accomodate different types of columns (e.g. containing unicode).

nitisht avatar Feb 14 '23 08:02 nitisht

@syepes From alamb's comment you can see how postgres interprets mixed case. From testing same with parseable it seems like you can query the data but it requires the field to be double quoted.

Technically this is not a bug, so we can just document this behavior properly. Trying to pursue this as a bug means breaking ingestion for fields with same name but differently cased ( which i know no one would actually do but still ).

For query if double quoting is not a common knowledge then we can tweak query statement using sql parser to always include quotes for non-lowercase column names

trueleo avatar Feb 15 '23 06:02 trueleo

Closing this, as this is intended behaviour from a SQL perspective. We'll add details on Parseable's case sensitivity in the concepts doc here : https://www.parseable.io/docs/concepts

nitisht avatar Feb 15 '23 06:02 nitisht

For query if double quoting is not a common knowledge then we can tweak query statement using sql parser to always include quotes for non-lowercase column names

I had a rapid look and it did not look like and easy task, as we would have to take care of the SELETC and WHERE fields. Something like: Convert to AST --> Double quote the .chars().any(|x| (x.is_uppercase() || !x.is_ascii()) --> Convert back to SQL String

But yes, this is definitively the expected behavior it just need to be documented :-)

syepes avatar Feb 15 '23 07:02 syepes