snowflake-sqlalchemy icon indicating copy to clipboard operation
snowflake-sqlalchemy copied to clipboard

SNOW-606810: Insert List into ARRAY column

Open ThomasJRyan opened this issue 2 years ago • 4 comments

What is the current behavior?

Currently it doesn't seem possible (as far as I'm able to tell) to insert a List into an ARRAY typed column.

What is the desired behavior?

The desired behaviour would be to be able to do this:

from my_app import app
from snowflake.sqlalchemy import ARRAY

db = SQLAlchemy(app)

class MyTable(db.Model):
    col = db.Column(ARRAY)

db.create_all()

obj = MyTable(col = ['Hello', 'World'])

db.session.add(obj)
db.session.flush()

# Error is raised
ProgrammingError: (snowflake.connector.errors.ProgrammingError) 002020 (21S01): SQL compilation error
Insert value list does not match column list expecting 1 but got 2
[SQL: INSERT INTO mytable (col) VALUES (%(col)s)]
[parameters: {'col': ['Hello', 'World]}]

I know the error is coming from snowflake.connector, but if I'm understanding it correctly, the error is being raised because the generated query

INSERT INTO mytable (col) VALUES (['Hello', 'World']);

isn't actually valid in Snowflake. It'd probably need to be converted to something like

INSERT INTO mytable (col) SELECT PARSE_JSON('["Hello", "World"]');

to work properly.

I'm not really sure of a way to do this outside of just writing out the query myself which seems counterintuitive to the purpose of SQLAlchemy. If I'm missing something obvious I apologize for the bother

How would this improve snowflake-connector-python?

Would allow users to easily add data to ARRAY columns

References, Other Background

Any examples are using Flask-SQLAlchemy to act as a wrapper for Snowflake-SQLAlchemy. I'm pretty sure this is missing from the latter rather than being a bug of the former. Please correct me if I'm wrong

ThomasJRyan avatar Jun 02 '22 19:06 ThomasJRyan

Hey @ThomasJRyan, We've managed to workaround this using something like

import json
import re

from sqlalchemy.sql.expression import Insert
from sqlalchemy.ext.compiler import compiles
from sqlalchemy import event
from sqlalchemy.engine import Engine


@compiles(Insert)
def patch_insert(insert, compiler, **kw):
    """
    Patches INSERT SQL queries so sqlalchemy ORM will support Snowflake VARIANT

    NOTICE: currently it patches ONLY VARIANT columns which named "raw"

    For more information (e.g. read about the parameters), please look at:
        * https://docs.sqlalchemy.org/en/14/core/compiler.html
    (or ask Shahar)
    """
    s = compiler.visit_insert(insert, **kw)
    s = s.replace("VALUES (%(raw)s)", "SELECT PARSE_JSON(column1) as raw from values (%(raw)s)")
    return s


@event.listens_for(Engine, "before_cursor_execute", retval=True)
def before_cursor_execute(conn, cursor, statement, parameters, context, executemany):
    """
    Patches UPDATE SQL queries so sqlalchemy ORM will support Snowflake VARIANT

    NOTICE: currently it patches ONLY VARIANT columns which named "raw"

    For more information (e.g. to read about the parameters), please look at:
        * SQL alchemy core events - https://docs.sqlalchemy.org/en/13/core/events.html
    (or ask Shahar)
    """
    # It's a select statement for a raw object, for us to get it properly, we need to patch the
    # %(param_1)s since it contains the whole JSON object.
    if statement.startswith('SELECT') and statement.endswith('%(param_1)s') and (
            '.raw' in statement:
        # SHAHAR: A weird edge case when combining "raw" queries + limit
        if statement.endswith("LIMIT %(param_1)s"):
            pass
        else:
            statement = statement.replace('%(param_1)s', 'PARSE_JSON(%(param_1)s)')
        return statement, parameters

    # elif this isn't a update to a "raw" column, leave it alone
    if "raw" not in parameters:
        return statement, parameters

    # we currently changing only UPDATE/DELETE
    if not statement.startswith("UPDATE") and not statement.startswith("DELETE") and not statement.startswith("INSERT"):
        return statement, parameters

    statement = statement.replace("SET raw=%(raw)s", "SET raw = PARSE_JSON(%(raw)s)")
    # update the where condition
    if "WHERE" in statement:
        statement = re.sub(r'(WHERE.*= )', r'\1PARSE_JSON(', statement)
        statement += ")"
    return statement, parameters

shahargl avatar Jun 07 '22 14:06 shahargl

Sorry @shahargl I meant to reply a week ago. This is an effective workaround and has unstuck me. Thank you very much for it

Would still be nice to see a more permanent fix someday though. I might look into a way to contribute it but I can't guarantee I'd be capable enough to implement a proper solution

ThomasJRyan avatar Jun 16 '22 18:06 ThomasJRyan

Agree, but they are not showing love to this repo

shahargl avatar Jun 19 '22 11:06 shahargl

As another alternative method SQLAlchemy's Insert.from_select could also work to satisfy the INSERT INTO … SELECT requirement for direct variant values:

from sqlalchemy import select, insert, func

s = select(func.ARRAY_CONSTRUCT(['Hello', 'World']))
i = insert(MyTable).from_select([MyTable.col], s)
session.execute(i)

sfc-gh-hachouraria avatar Aug 31 '22 10:08 sfc-gh-hachouraria

To clean up and re-prioritize bugs and feature requests we are closing all issues older than 6 months as of Apr 1, 2023. If there are any issues or feature requests that you would like us to address, please re-create them. For urgent issues, opening a support case with this link Snowflake Community is the fastest way to get a response

github-actions[bot] avatar Apr 05 '23 01:04 github-actions[bot]