sqlite-database-integration icon indicating copy to clipboard operation
sqlite-database-integration copied to clipboard

WP_MySQL_Naive_Query_Stream

Open adamziel opened this issue 2 months ago • 3 comments

Proposes a WP_MySQL_Naive_Query_Stream to enable stream-processing large SQL files one query at a time without running out of memory.

Usage:

$stream = new WP_MySQL_Naive_Query_Stream();
$stream->append_sql( 'SELECT id FROM users; SELECT * FROM posts;' );
while ( $stream->next_query() ) {
    $sql_string = $stream->get_query();
    // Process the query.
 }

$stream->append_sql( 'CREATE TABLE users (id INT, name VARCHAR(255));' );
while ( $stream->next_query() ) {
     $sql_string = $stream->get_query();
     // Process the query.
}

$stream->mark_input_complete();
$stream->next_query(); // returns false

This class is naive because it doesn't understand what a valid query is.

We assume an invalid query if we can't get the next token and the input source is already exhausted or we have over 2MB of buffered SQL. We can't do better until the lexer provides an explicit distinction between syntax errors and incomplete input. I expect this heuristic to be sufficient in many scenarios, but it will of course fail in pathological cases such as SELECT SELECT SELECT ... without any semicolons.

Related to https://github.com/Automattic/wp-cli-sqlite-command/pull/13

Remaining work

Review this PR, reformat code, add some more comments.

cc @JanJakes @sejas

adamziel avatar Oct 09 '25 15:10 adamziel

@adamziel Thanks for sharing the draft and APIs!

This class is naive because it doesn't understand what a valid query is. We can't do better until the lexer provides an explicit distinction between syntax errors and incomplete input.

Interesting! I was only thinking of implementing the full support in the lexer. I don't think it's very hard, but probably not very easy either.

That said, do you think it makes sense to get it in in the "naive" form? With the quick fixes I did today on the WP CLI SQLite side, covered with tests, it supports everything apart from NO_BACKSLASH_ESCAPES and DELIMITER .... We have no other use case yet, so it's likely not super urgent, but eventually necessary for sure.

JanJakes avatar Oct 09 '25 16:10 JanJakes

Yeah if there isn't a use-case, I think it's fine for this to sit here until one emerges. It likely will as a part of the streaming importer work, and we may need to explore a non-naive query stream implementation for that.

adamziel avatar Oct 09 '25 18:10 adamziel

@adamziel If we run into another issue with the current WP SQLite CLI parsing, I'll definitely use this in some form. Also, when moving the CLI commands to the SQLite repo, it could make sense to give this a try.

JanJakes avatar Oct 10 '25 09:10 JanJakes