crystal-pg icon indicating copy to clipboard operation
crystal-pg copied to clipboard

Add support for extensions like `citext`

Open jgaskins opened this issue 2 years ago • 3 comments

This PR actually adds a few different things:

  • The concept of handling extensions
  • An API to register those extensions to prepare new connections for them
  • Connection-specific decoders, delegating to the global decoder map
  • citext as one of those extensions, including a decoder for it and a PG::CIText Crystal type

The CIText type could just return a string, but I thought it might be confusing to get a case-insensitive string out of the database only for it to do case-sensitive checking later. This type offers the case-insensitive equality checks in Crystal code and also allows you to get the raw string out.

After working on this, I discovered #89 which is quite similar to this PR, but is 5 years old and likely significantly out of date by this point. This is just a draft because it’s still unpolished, but I’m using a monkeypatched version in one of my apps (email addresses are stored as citext) and it seems to work.

jgaskins avatar Apr 20 '22 03:04 jgaskins

I haven't looked into the implementation of the extension mechanism and compared it to #89.

But I noticed some issues with text representation in CIText:

  • #hash should have the same behaviour as #==, i.e. two strings that compare equally must have the same hash value. The hash needs to reflect case-insensitivity.
  • A more serious issue: There is not just a single case insensitive behaviour, but different sets of folding rules for mapping upper and lower case characters to each other. In Postgres, these rules are represented as collations. The case-insensitive behaviour of citext values is defined by the collation. But the Crystal implementation doesn't reflect that. It only uses String#compare(other, case_insensitive: true) which has a fixed, generic unicode collation (there are limited options for other folding algorithms). As a result, comparing two CIText strings in Crystal could have a different result than comparing the same two strings in the database. That would be wildly confusing. The implementation should either have identical behaviour (which is the goal of using CIText instead of String in the first place), or be clear about that it isn't identical. It would be quite complex to implement this correctly, and I don't even see a good reason for that. So I'd rather keep it dumb and make that explicit. Just using String type should be fine. It should be clear that comparing that type is case sensitive.

straight-shoota avatar Apr 20 '22 10:04 straight-shoota

#hash should have the same behaviour as #==, i.e. two strings that compare equally must have the same hash value. The hash needs to reflect case-insensitivity.

Ah, right, that's why I had the downcase in there. I knew I had a good reason for it.

It would be quite complex to implement this correctly, and I don't even see a good reason for that. So I'd rather keep it dumb and make that explicit.

I considered the differences in collation, but I've never seen anyone specify a non-default collation in 13 years of using Postgres. I assume that anyone using them understands they introduce weird inconsistencies between DB and application code in addition to complicating queries. For everyone else, it should work as expected. If collations got more usage, I'd fully agree with you that not matching the behavior exactly would likely be too misleading.

I don't really have a strong opinion either way on the CIText type, to be clear, but I added it to show that I explored the option. I use it in one of my own apps for email addresses and it does okay there. I prefer it to calling downcase everywhere I need to compare that string with another. To be fair, I could also use a converter inside my app to do this, but it's convenient not to need to.

jgaskins avatar Apr 20 '22 19:04 jgaskins

Hey, just ran in to possibly needing this... I'm wanting to use a citext[], but currently it fails on decoding. Any chance we can look at reviving it @jgaskins ?

jwoertink avatar Sep 13 '23 00:09 jwoertink