Lighter
Lighter copied to clipboard
Add support for full text search (FTS 3,4,5)
It would be nice to detect FTS tables and do special things with them. I.e. also omit FTS backing tables from Lighter schemas (they need to be included in schemas if DB recreation is on, but no regular Swift funcs etc should be emitted for them?).
Table Detection
The tables can be detected by looking at the SQL in the metadata:
CREATE VIRTUAL TABLE papers USING fts3(...)
i.e. "VIRTUAL TABLE" and "USING fts*".
FTS tables, even contentful, have no associated column types (though they should be all TEXT?).
FTS creates a set of supporting, regular, tables that should be omitted, FTS 3:
papers_content
papers_segments
papers_segdir
FTS 4:
papers4_content
papers4_segments
papers4_segdir
papers4_docsize
papers4_stat
FTS 5:
email_data
email_idx
email_content
email_docsize
email_config
So first the FTS table needs to be detected, than the associated tables need to be tagged/excluded.
External content FTS tables can refer to a backing table ("documents" in this case):
USING fts5(title, value, content='documents')
So that can be detected too (i.e. whether a regular table has one (or more! e.g. for multiple languages) FTS backing tables).
Operations
It is unclear what kind of operations would need to be supported? Everything in FTS centers around MATCH
queries, like:
SELECT * FROM mail WHERE body MATCH 'sqlite';
Note that FTS table columns are completely untyped (but except for rowid
always TEXT
?)
The rowid
is also more important to match up w/ the content table and needs to be exposed? (e.g. it can be aliased to an integer primary key and is also stable between VACUUMs).
Then FTS also has a set of special functions, like highlight and snippet:
SELECT highlight(email, 2, '<b>', '</b>') FROM email WHERE email MATCH 'fts5';
2 is the 0 based column index here ("email" is the FTS table). Also note that the table can be used on the left side of the match.
If an FTS table has a backing table (content='mytable'
), maybe the ops should be directly attached to the content table, like sqlite_mytable_fts(...)
.
FTS 5 query language:
<phrase> := string [*]
<phrase> := <phrase> + <phrase>
<neargroup> := NEAR ( <phrase> <phrase> ... [, N] )
<query> := [ [-] <colspec> :] [^] <phrase>
<query> := [ [-] <colspec> :] <neargroup>
<query> := [ [-] <colspec> :] ( <query> )
<query> := <query> AND <query>
<query> := <query> OR <query>
<query> := <query> NOT <query>
<colspec> := colname
<colspec> := { colname1 colname2 ... }
Ordering has a special rank
column (ORDER BY rank
), that would need to be made available.