risinglight icon indicating copy to clipboard operation
risinglight copied to clipboard

binder: encapsulate lower-case conversion for identifiers

Open pleiadesian opened this issue 2 years ago • 8 comments

According to this comment, we should encapsulate the lower-case conversion for better extensibility. However, we can not do this by modifying the Catalog, because we have other states owning the identifiers, e.g., every field in BindContext.

I think the best way to do this is to visit the AST generated by sqlparser, and convert all the names into lower-case.

pleiadesian avatar Jan 02 '22 02:01 pleiadesian

Just convert the input string before parsing🙈

wangrunji0408 avatar Jan 02 '22 05:01 wangrunji0408

Just convert the input string before parsing🙈

But this won’t retain users’ preference when creating tables. For example, they might create column V1, but will get v1 instead.

skyzh avatar Jan 02 '22 07:01 skyzh

Just convert the input string before parsing🙈

But this won’t retain users’ preference when creating tables. For example, they might create column V1, but will get v1 instead.

This is exactly what Postgres does.🙈

pleiadesian avatar Jan 02 '22 08:01 pleiadesian

According to this comment, we should not convert input string into lower case. So, converting all identifiers in the AST seems to be the acceptable choice.

pleiadesian avatar Jan 03 '22 05:01 pleiadesian

Conceptually, traversing the AST sounds easy and intuitive. But since the AST is implemented as enum's in sqlparser, how do you traverse enums? Manual matches?

yyin-dev avatar Sep 20 '22 02:09 yyin-dev

Yes, that's true 🤣 btw I just realized that Postgres parser will convert everything to lowercase, so we might follow its convention and just convert everything to lowercase.

skyzh avatar Sep 20 '22 02:09 skyzh

so we might follow its convention and just convert everything to lowercase.

Do you mean after converting names in the AST? Converting the query string into lowercase cannot handle the following correctly:

image

yyin-dev avatar Sep 20 '22 04:09 yyin-dev

Oh, you're right. I mean all identifiers (like table name, column name) are lowercase. Probably we only need to make all table / column case-insensitive in binder.

skyzh avatar Sep 20 '22 06:09 skyzh