floor icon indicating copy to clipboard operation
floor copied to clipboard

Add a query parser

Open mqus opened this issue 4 years ago • 6 comments

Introduction

This issue tracks the effort of adding a parser for SQLite queries, which will enable us to do static analysis on the queries, get dependencies of DatabaseViews, be aware of affected databases when allowing arbitrary queries and much more.

I have started implementing this for a prototype and want to build on the work done in #320 .

I updated the top post to better reflect the current status.

Roadmap/TODO

  • [x] Add support for CREATE VIEW and Views in sqlparser (https://github.com/simolus3/moor/pull/563)
  • [x] Add dependency inference to sqlparser (is 80% there already, the rest has to be done from our side).
  • [ ] ~(Fix small issues in sqlparser variable support (named parameter binding over positional, support for @xyz and $abc variables))~ invalid, has to be solved from our side:
    • [ ] analyze the parameters by using a Variable Visitor and prepare as input for analyzer
  • [ ] Integrate sqlparser into floor (#361):
    • [x] Dependency inference for Streams
    • [x] Return type inference/type checking
    • [x] Better variable/parameter injection
    • [x] analyze rawQuery queries as to what things are changed and which listeners should be triggered.

Old post, not really up2date anymore because I abandoned the antlr approach in favor of the much more mature sqlparser (as mentioned below):

~The current strategy is to use the antlr SQLite grammar updated here(MIT licensed) and use the dart target of antlr as soon as it is ready:~

  • ~Generate the parser files once and add them into the git repository because they don't have to be generated frequently and we will need to have the antlr tool installed for it~
  • ~Document the generation process/commands~
  • ~Let dartanalyzer ignore the generated files~
  • ~Add the dart antlr runtime as a dependency on floor_generator~

The plan is then to give QueryAdapter.query(List)Stream a set of entities to watch out for (instead of the single entity it waits on right now). I'm not sure how to wire it up to there yet but it should not be complicated.

~I'll open a PR once I have a prototype ready. At the moment this includes a git submodule containing the antlr PR referenced above and I don't want to push that.~

mqus avatar May 02 '20 13:05 mqus

This is not really an update but more thoughts about the feature:

I thought that it may be possible to have somewhat of an effect system where Queries with a Stream as the return value are simply run again as soon as the input entity changes, regardless if it is a SELECT statement or a DELETE statement. On one hand this would make an excellent system for building triggers. On the other hand SQLite already supports triggers and we can easily introduce loops as we don't know exactly if any change has been actually made or if the query was simply a no-op. personally I find this effect system intriguing but I'm aware that it would probably introduce more problems than it would solve.

Apart from that, I went through the possible sqlite statements to parse and want to restrict this parsing effort to just a few of them. Other statements would not make sense or are too complicated to get right. As mentioned above, statements can have dependencies (updates to a dependency would change the output of the statement) and affected entities (executing the statement would change those entities)

To include:

  • SELECT-Statements (including joins, Unions, etcpp): Only has dependencies.
  • DELETE-Statements: Affects at most one entity. Could have dependencies.
  • INSERT/REPLACE-Statements: Affects at most one entity. Could have dependencies.
  • UPDATE-Statements: Affects at most one entity. Could have dependencies.

To (explicitly) exclude (for now):

  • CREATE/DROP TRIGGER-Statements: Are a special case, as they also introduce new dependencies into other statements, same as the view creation. But as they are usually not specified with an annotation but in the databaseBuilder with a callback, they will be tricky to detect.
  • statements containing a database specifier for the entity
  • statements using an entity variable (e.g. "SELECT id FROM :table"), is not allowed by the grammar right now
  • statements replacing arbitrary tokens with a query (e.g. "SELECT * FROM table :qualifiers"), is not allowed by the grammar right now
  • statements with "WITH" clauses at the start (temporary DatabaseViews), just too complicated for now. But should be possible.
  • ALTER TABLE, CREATE/DROP *, PRAGMA, ATTACH/DETACH, EXPLAIN,SAVEPOINT, TRANSACTION: they don't change or query the data itself but query/alter the scheme or other metadata. Not relevant for our functionality, considering that ORMs expect a static scheme only changed by versioned migrations.

Another thought I had right now: Another feature we can gain with a query parser is deriving the type to output for a @Query function. I didn't think deeply about this yet.

META: I should probably just start to code instead of writing novels here :wink:

mqus avatar May 02 '20 22:05 mqus

This sounds amazing! Thanks for your efforts. Did you look into the sqlparser package by any chance to find out if it can solve our problems?

Another thought I had right now: Another feature we can gain with a query parser is deriving the type to output for a @query function. I didn't think deeply about this yet.

This featured would be quite interesting and has actually been requested by users already (#94, #130)

vitusortner avatar May 03 '20 15:05 vitusortner

This sounds amazing! Thanks for your efforts. Did you look into the sqlparser package by any chance to find out if it can solve our problems?

No I didn't and this seems far easier to work with than the antlr approach. I'll definitely look into it.

This featured would be quite interesting and has actually been requested by users already (#94, #130)

Yeah I made it sound like I just had that idea but in truth I knew it but had forgotten that this is also something we can do with the query parsing and wanted to add it, just so I wont forget it again.

mqus avatar May 03 '20 15:05 mqus

Here is a list of features that depend on the query parser being implemented:

  • #58 Query Validation
  • #94 Parse queries into objects, as mentioned already.
  • #200 Return int values from sum/count queries
  • And other issues I have closed as duplicates of the mentioned issues.

mqus avatar May 05 '20 18:05 mqus

I updated the issue text to summarize the current status.

mqus avatar May 20 '20 21:05 mqus

Awesome! Thanks for your investment!

vitusortner avatar May 21 '20 05:05 vitusortner