sqlite_fdw
sqlite_fdw copied to clipboard
push down of `ORDER BY` on indexed float columns ends up doing a seq scan
say i have a table like this in sqlite:
sqlite> .schema states
CREATE TABLE IF NOT EXISTS "states" (
state_id INTEGER NOT NULL,
entity_id CHAR(0),
state VARCHAR(255),
attributes CHAR(0),
event_id SMALLINT,
last_changed CHAR(0),
last_changed_ts FLOAT,
last_reported_ts FLOAT,
last_updated CHAR(0),
last_updated_ts FLOAT,
old_state_id INTEGER,
attributes_id INTEGER,
context_id CHAR(0),
context_user_id CHAR(0),
context_parent_id CHAR(0),
origin_idx SMALLINT,
context_id_bin BLOB,
context_user_id_bin BLOB,
context_parent_id_bin BLOB,
metadata_id INTEGER,
PRIMARY KEY (state_id),
FOREIGN KEY(old_state_id) REFERENCES states (state_id),
FOREIGN KEY(attributes_id) REFERENCES state_attributes (attributes_id),
FOREIGN KEY(metadata_id) REFERENCES states_meta (metadata_id)
);
CREATE INDEX ix_states_context_id_bin ON states (context_id_bin);
CREATE INDEX ix_states_metadata_id_last_updated_ts ON states (metadata_id, last_updated_ts);
CREATE INDEX ix_states_attributes_id ON states (attributes_id);
CREATE INDEX ix_states_last_updated_ts ON states (last_updated_ts);
CREATE INDEX ix_states_old_state_id ON states (old_state_id);
and I have the following foreign table in pg for it:
hass=# \d states
Foreign table "public.states"
Column | Type | Collation | Nullable | Default | FDW options
-----------------------+------------------------+-----------+----------+---------+--------------
state_id | integer | | | | (key 'true')
metadata_id | integer | | | |
state | character varying(255) | | | |
last_changed_ts | double precision | | | |
last_updated_ts | double precision | | | |
old_state_id | integer | | | |
attributes_id | integer | | | |
context_id_bin | bytea | | | |
context_user_id_bin | bytea | | | |
context_parent_id_bin | bytea | | | |
origin_idx | integer | | | |
Server: hass
FDW options: ("table" 'states')
a query that tries to sort by the last_updated_ts column doesn't use the index in sqlite because it's accessed via sqlite_fdw_float():
hass=# explain analyze verbose select * from states order by last_updated_ts desc limit 10;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------
Foreign Scan on public.states (cost=1.00..1.00 rows=1 width=648) (actual time=1973.694..1973.712 rows=10 loops=1)
Output: state_id, metadata_id, state, last_changed_ts, last_updated_ts, old_state_id, attributes_id, context_id_bin, context_user_id_bin, context_parent_id_bin, origin_idx
SQLite query: SELECT `state_id`, `metadata_id`, `state`, sqlite_fdw_float(`last_changed_ts`), sqlite_fdw_float(`last_updated_ts`), `old_state_id`, `attributes_id`, `context_id_bin`, `context_user_id_bin`, `context_parent_id_bin`, `origin_idx` FROM main."states" ORDER BY sqlite_fdw_float(
`last_updated_ts`) DESC NULLS FIRST LIMIT 10
Planning Time: 0.096 ms
Execution Time: 1973.816 ms
(5 rows)
a similar query via sqlite itself takes almost no time:
dlg@defeat hass$ time sqlite3 home-assistant_v2.db 'select state_id,metadata_id,state,last_updated_ts from states order by last_updated_ts desc limit 10'
265708270|848|off|1735884102.15356
265708269|609|5.9|1735884102.15348
265708268|475|off|1735884102.15329
265708267|8171|5849.6|1735884099.05672
265708266|633|8.3|1735884099.05631
265708265|830|278.1|1735884099.05625
265708264|533|1640.79|1735884099.05617
265708263|439|6868|1735884099.05609
265708262|304|5222.36|1735884099.05602
265708261|608|5.9|1735884099.05593
real 0m0.002s
user 0m0.001s
sys 0m0.001s
would it be possible to order by the "unwrapped" float value?
would it be possible to order by the "unwrapped" float value?
If in a table will be Infinity or -Inf or NaN or other special IEEE value value with text affinity, here will no ISO:SQL result, but very strange order of values. Does you need this, @dgwynne ?
UPD:
If yes, I can help you how to implement column option for fast, but not ISO:SQL behaviour in case of you are sure there are only values with real affinity.
In this situation there are only reals. there's no infinity, nans, etc.
In this situation there are only reals. there's no infinity, nans, etc.
I think, you potentially can use option column_type with real value for fast implementation of processing for your case.
This is key lines of C code
https://github.com/pgspider/sqlite_fdw/blob/af42d07f66c532acf91f099ba011e6feea86afc7/deparse.c#L2258-L2269
options usage example is also in the same function and near
https://github.com/pgspider/sqlite_fdw/blob/af42d07f66c532acf91f099ba011e6feea86afc7/deparse.c#L2224-L2235
Testcases (TCs) about your situation you can add after https://github.com/pgspider/sqlite_fdw/blob/af42d07f66c532acf91f099ba011e6feea86afc7/sql/17.0/types/float4.sql#L853
This can be something like
ALTER TABLE ... ALTER COLUMN "i" OPTIONS (ADD column_type 'real');
-- Testcase {NEXT}
EXPLAIN (VERBOSE, COSTS OFF)
{YOUR QUERY}
@dgwynne , you can base on my draft and add needed TCs. Are you ready to support your own PR with requested functional and regress tests for saving this functional after future PRs?
i can try
i can try
Please refer https://github.com/pgspider/fdw_testing_scripts and do reset; ../new_mulver_cycle.sh sqlite_fdw ''; between your changes. Happy implementation and debugging!
https://github.com/dgwynne/sqlite_fdw/tree/pure_float seems to work. your draft was very close, just needed one tweak.
i just have to figure test cases out now.
@dgwynne , we have forgotten
if (qualify_col)
ADD_REL_QUALIFIER(buf, varno)
for code without unification. This is obligatory macros for table name before column name if needed.
@dgwynne , we have forgotten
if (qualify_col) ADD_REL_QUALIFIER(buf, varno)for code without unification. This is obligatory macros for table name before column name if needed.
i've just pushed this to my branch.
still need to look at tests.
still need to look at tests.
@dgwynne , which of tests you can repeat for a new implementation? Look at https://github.com/pgspider/sqlite_fdw/commit/c81bc847570fa25f545dcbde50e5ef356ed86555 and add (repeat) equal test set for your implementation. For example
SELECT c1, c2, c3, c4, c5, c6, c7, c8, c9, c10, c11, c12, c13, c14, c15, c17, c18, c19, c2, c21, c22, c23, c24 FROM alltypetest;
gives SQLite query: SELECT `c1`, `c2`, `c3`, `c4`, `c5`, `c6`, `c7`, `c8`, `c9`, `c10`, `c11`, `c12`, `c13`, `c14`, `c15`, sqlite_fdw_float(`c17`), sqlite_fdw_float(`c18`), sqlite_fdw_float(`c19`), sqlite_fdw_float(`c21`), sqlite_fdw_float(`c22`),c23, c24 FROM alltypetest;.
You can set a new option to some columns and test EXPALIN VERBOSE again. Other ideas see in the commit.
@dgwynne @mkgrgis Let me confirm about the problem.
If we use ORDER BY last_updated_ts, it takes almost no time.
If we use ORDER BY sqlite_fdw_float(last_updated_ts), it takes much more time.
Is my understanding correct?
@lamdn1409 , I think you did complete description of problem. Isn't it, @dgwynne ?
Also I can add sqlite_fdw_float is a deterministic function for SQLite execution context, hence it potentially can be cached.
Ping, @dgwynne ! Any problems? My plan of new TCs:
- Explain
SELECT-> error because of any ∞ value withtextaffinityDELETE...WHERE typeof(...) = 'text'SELECT-> OK