python3-driver icon indicating copy to clipboard operation
python3-driver copied to clipboard

Support for SQL scripts?

Open fdcastel opened this issue 6 months ago • 4 comments

What's the current support for SQL scripts?

Reviewing the documentation and source code, it appears there is none.

  • Are there any known workarounds?
  • Any plans to implement this kind of support?

Something like:

SET TERM ^^ ; 

CREATE OR ALTER PROCEDURE STP_ONE...
AS
BEGIN
END
^^ 

CREATE PROCEDURE STP_TWO...
AS
BEGIN
END
^^ 

Or, even better

CREATE OR ALTER PROCEDURE STP_ONE...
AS
BEGIN
END;

CREATE PROCEDURE STP_TWO...
AS
BEGIN
END;

I've never been a fan of SET TERM. And most SQL databases get by just fine without it.

Any parser could easily track BEGIN/END pairs and distinguish between semicolons inside the block and the ones that terminate the CREATE PROCEDURE command.

Happy to put together a PR if you're interested.

fdcastel avatar Jun 26 '25 21:06 fdcastel

An initial attempt:

def parse_firebird_script(script: str) -> list[str]:
    """
    Parse a Firebird SQL script into a list of statements, handling SET TERM and blocks by tracking BEGIN/END pairs only.
    Comments (-- and /* */) are ignored.
    """
    import re

    # Remove /* ... */ comments (including multiline)
    script = re.sub(r"/\*.*?\*/", "", script, flags=re.DOTALL)

    # Remove -- comments
    script = re.sub(r"--.*", "", script)

    statements = []
    current = []
    term = ";"
    block_depth = 0

    # Normalize line endings
    lines = script.replace("\r\n", "\n").replace("\r", "\n").split("\n")

    i = 0
    while i < len(lines):
        line = lines[i]

        # Handle SET TERM
        set_term_match = re.match(r"\s*SET\s+TERM\s+(.+)\s+(.+)", line, re.IGNORECASE)
        if set_term_match:
            term = set_term_match.group(1).strip()
            i += 1
            continue

        current.append(line)

        # Track block depth
        block_depth += len(re.findall(r"\bBEGIN\b", line, re.IGNORECASE))
        block_depth -= len(re.findall(r"\bEND\b", line, re.IGNORECASE))

        # End of statement: block_depth==0 and line ends with terminator
        if block_depth <= 0 and line.strip().endswith(term):
            statement = "\n".join(current).rstrip()
            if statement.endswith(term):
                statement = statement[: -len(term)]
            if statement.strip():
                statements.append(statement.strip())
            current = []
            block_depth = 0
        i += 1

    # Add any trailing statement
    if current and any(s.strip() for s in current):
        statements.append("\n".join(current).strip())

    return [s for s in statements if s.strip()]

fdcastel avatar Jun 26 '25 22:06 fdcastel

BTW: The above function is intended to handle both scenarios.

SET TERM is completely optional (but still works for backward compatibility). It also works with semicolon-only scripts.

fdcastel avatar Jun 26 '25 22:06 fdcastel

ALTER DATABASE BEGIN BACKUP

boom

asfernandes avatar Jun 27 '25 00:06 asfernandes

v2:

  • Tracks BEGIN/END pairs only for commands with <module-body>, as per the docs;
  • Do not execute SET TERM.
def parse_firebird_script(script: str) -> list[str]:
    """
    Parse a Firebird SQL script into statements, handling SET TERM and BEGIN/END blocks
    only for module body commands (procedures, triggers, functions, packages, execute blocks).
    """
    import re

    # Pattern for all module body commands
    #   https://firebirdsql.org/file/documentation/html/en/refdocs/fblangref50/firebird-50-language-reference.html#fblangref50-psql-elements-body-syntax
    module_body_pattern = re.compile(
        r"""
        \b(
            (?:CREATE|ALTER|RECREATE|CREATE\s+OR\s+ALTER)\s+
            (?:TRIGGER|PROCEDURE|FUNCTION|PACKAGE(?:\s+BODY)?)|
            EXECUTE\s+BLOCK
        )\b
        """,
        re.IGNORECASE | re.VERBOSE,
    )

    def remove_comments(line: str) -> str:
        """Remove comments from a line for logic processing."""

        # Remove -- comments
        line = re.sub(r"--.*", "", line)

        # Remove /* */ comments
        line = re.sub(r"/\*.*?\*/", "", line)
        return line

    statements = []
    current_lines = []
    terminator = ";"
    block_depth = 0
    in_module_body = False

    lines = script.replace("\r\n", "\n").replace("\r", "\n").split("\n")

    for line in lines:
        logic_line = remove_comments(line)

        # Handle SET TERM
        if match := re.match(r"\s*SET\s+TERM\s+(\S+)", logic_line, re.IGNORECASE):
            terminator = match.group(1)
            continue  # Don't include SET TERM in output

        current_lines.append(line)

        # This starts a module body command?
        if not in_module_body:
            current_text = "\n".join(current_lines)
            if module_body_pattern.search(remove_comments(current_text)):
                in_module_body = True

        # Track BEGIN/END depth only in module bodies
        if in_module_body:
            block_depth += len(re.findall(r"\bBEGIN\b", logic_line, re.IGNORECASE))
            block_depth -= len(re.findall(r"\bEND\b", logic_line, re.IGNORECASE))

        # Check for statement end
        ends_with_terminator = logic_line.strip().endswith(terminator)
        is_complete = (not in_module_body) or (block_depth <= 0)

        if ends_with_terminator and is_complete:
            statement = "\n".join(current_lines).rstrip()
            if statement.endswith(terminator):
                statement = statement[: -len(terminator)]
            if statement.strip():
                statements.append(statement.strip())

            # Reset for next statement
            current_lines = []
            block_depth = 0
            in_module_body = False

    # Add any remaining statement
    if current_lines and any(line.strip() for line in current_lines):
        statements.append("\n".join(current_lines).strip())

    return [stmt for stmt in statements if stmt.strip()]

fdcastel avatar Jun 27 '25 05:06 fdcastel