database_consistency
database_consistency copied to clipboard
citext false positive
Hey @djezzzl ! :wave:
I have the following in db/schema.rb
:
enable_extension "citext"
create_table "groups", force: :cascade do |t|
t.citext "name", null: false
t.index ["name"], name: "index_groups_on_name", unique: true
Model:
validates :name, uniqueness: { case_sensitive: false }
database_consistency
complains:
MissingUniqueIndexChecker fail Group lower(name) model should have proper unique index in the database
psql mydb
:
# \d groups
Table "public.groups"
Column | Type | Collation | Nullable | Default
--------------------+--------------------------------+-----------+----------+-------------------------------------------------
name | citext | | not null |
...
Indexes:
"index_groups_on_name" UNIQUE, btree (name)
PostgreSQL citext
extension docs: https://www.postgresql.org/docs/current/citext.html
I've debugged it, and:
sorted_uniqueness_validator_columns # => ["lower(name)"]
Helper.extract_index_columns(index.columns).sort # => ["name"]
:boom:
Does it make sense to remove/ignore lower(
from the index column if this column's type is citext
?
database_consistency
version 1.1.2
Hi @pirj ,
It makes total sense for me 👍 Would you mind to contribute?
Hi @pirj ,
I have finally a PR for this (https://github.com/djezzzl/database_consistency/pull/147) to be fixed but I tested the planner and I think we still would better have a proper (lower) index instead.
The reason is that ActiveRecord doesn't recognize citext
and apply lower
anyway:
Entity.new(email: 'TMPTMP').valid?
Entity Exists? (0.3ms) SELECT 1 AS one FROM "entities" WHERE LOWER("entities"."email") = LOWER($1) LIMIT $2 [["email", "TMPTMP"], ["LIMIT", 1]]
database_consistency_test=# \d tmp;
Table "public.tmp"
Column | Type | Collation | Nullable | Default
--------+--------+-----------+----------+---------
email | citext | | |
Indexes:
"tmp_idx" btree (email)
database_consistency_test=# explain analyze select * from tmp where lower(email) = lower('asdfasdf');
QUERY PLAN
----------------------------------------------------------------------------------------------------
Seq Scan on tmp (cost=0.00..2040.00 rows=500 width=9) (actual time=26.415..26.415 rows=0 loops=1)
Filter: (lower((email)::text) = 'asdfasdf'::text)
Rows Removed by Filter: 100000
Planning Time: 0.296 ms
Execution Time: 26.429 ms
(5 rows)
database_consistency_test=# explain analyze select * from tmp where email = 'asdfasdf';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------
Index Only Scan using tmp_idx on tmp (cost=0.42..4.44 rows=1 width=9) (actual time=0.044..0.045 rows=0 loops=1)
Index Cond: (email = 'asdfasdf'::citext)
Heap Fetches: 0
Planning Time: 0.047 ms
Execution Time: 0.057 ms
(5 rows)
And after adding a lower index:
database_consistency_test=# \d tmp;
Table "public.tmp"
Column | Type | Collation | Nullable | Default
--------+--------+-----------+----------+---------
email | citext | | |
Indexes:
"tmp_idx" btree (lower(email::text))
database_consistency_test=# explain analyze select * from tmp where email = 'asdfasdf';
QUERY PLAN
--------------------------------------------------------------------------------------------------
Seq Scan on tmp (cost=0.00..1790.00 rows=1 width=9) (actual time=55.766..55.767 rows=0 loops=1)
Filter: (email = 'asdfasdf'::citext)
Rows Removed by Filter: 100000
Planning Time: 0.134 ms
Execution Time: 55.777 ms
(5 rows)
database_consistency_test=# explain analyze select * from tmp where lower(email) = lower('asdfasdf');
QUERY PLAN
-------------------------------------------------------------------------------------------------------------
Index Scan using tmp_idx on tmp (cost=0.42..8.44 rows=1 width=9) (actual time=0.040..0.041 rows=0 loops=1)
Index Cond: (lower((email)::text) = 'asdfasdf'::text)
Planning Time: 0.086 ms
Execution Time: 0.054 ms
(4 rows)
With that said, I will abandon the PR and close the issue.
Please feel free to reopen it.
Hey @djezzzl 👋
Thanks for giving it a shot.
There's this deficiency with a redundant lower
that needs to be fixed on the Rails side.
At this stage I don't see how database_consistency
could handle it.
https://github.com/rails/rails/pull/46568 one done, one to go
Wow! Well done! Unfortunately, my past experience contributing to ROR wasn't great so I didn't expect it's actually possible 😀
https://github.com/rails/rails/pull/46592
Wow, you do the magic! Thank you for improving Rails experience for all of us!