snowflake-sqlalchemy
snowflake-sqlalchemy copied to clipboard
SNOW-889850: Add autoincrement support without (explicit) sequences
What is the current behavior?
Currently, if you specify a column with autoincrement=True
you get errors. If you add a sequence in an ORM, then it can work but Alembic doesn't autogenerate the sequence. However, autoincrement works without explicit treatment of sequences when working in plain SQL.
Does not work / autogenerate correct Alembic migration:
id = Column(Integer, primary_key=True, autoincrement=True)
Also does not work without explicit CreateSequence
:
id = Column(Integer, Sequence("my_id_seq"), primary_key=True)
Works:
# Alembic
op.execute(CreateSequence(Sequence("my_id_seq")))
# ORM
id = Column(Integer, Sequence("my_id_seq"), primary_key=True)
What is the desired behavior?
It would be great to support a simple autoincrement
column without having to deal with sequences. This seems to work in a Snowflake workbook:
-- Create a table with an AUTOINCREMENT column, no sequence declared.
CREATE OR REPLACE TABLE my_tmp_table (
ID NUMBER(38,0) AUTOINCREMENT,
SENSOR_NAME VARCHAR(64) NOT NULL,
DATA_VALUE FLOAT NOT NULL,
unique (ID) rely ,
primary key (ID, SENSOR_NAME) rely
);
-- A plain insert with no id specified works without error.
INSERT INTO my_tmp_table (sensor_name, data_value) VALUES ('abc', 9.99), ('cde', 88.8);
-- Inserting values via a MERGE works without error as well.
-- Repeating the MERGE does not create more rows, as expected.
MERGE INTO
my_tmp_table
USING (
SELECT * FROM (
VALUES
('fgh', 2.22),
('ijk', 3.33)
) AS value_list (sensor_name, data_value)
) AS vlist
ON
my_tmp_table.sensor_name = vlist.sensor_name
AND my_tmp_table.data_value = vlist.data_value
WHEN NOT MATCHED
THEN INSERT (
sensor_name,
data_value
)
VALUES (
vlist.sensor_name,
vlist.data_value
)
;
-- All the rows show up with autogenerated sequential IDs.
SELECT * FROM my_tmp_table;
How would this improve snowflake-connector-python
?
This would simplify the usage of the AUTOINCREMENT
feature, and also improve Alembic compatibility by avoiding sequences.
References, Other Background
- Alembic doesn't support create/drop sequences
- autoincrement issues requiring use of sequences: https://github.com/snowflakedb/snowflake-sqlalchemy/issues/124 , https://github.com/snowflakedb/snowflake-sqlalchemy/issues/117
- CREATE TABLE docs mention that is uses sequences but don't show creating a sequence in the examples
Hey this makes sense to me but it won't be in the prioritized list of items that we will work on in the short term. We are happy to review PR though :)
Thanks for the response @sfc-gh-sfan . (I don't have any experience with SqlAlchemy/Alembic internals so I probably won't make a PR.)
I was surprised to see that Alembic ops like sa.Column("my_col", sa.Integer(), nullable=False)
are producing autoincrement
s: MY_COL NUMBER(38,0) NOT NULL autoincrement start 1 increment 1 noorder,
.
I'm also still getting the sqlalchemy.orm.exc.FlushError: Instance <MyOrmClass at 0x7f7fd3a4e470> has a NULL identity key.
error described in #124 . I tried some variations of the ORM delaration looking at discussion on this SO post: Column(Integer, primary_key=True, nullable=False, autoincrement=True)
, without nullable
(no difference), without primary_key
(causes ORM issues with relationships), with a server_default
(no difference).
So, I haven't found any way to use AUTOINCREMENT
through SqlAlchemy.
i made a base.py
like this:
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, Sequence
from sqlalchemy.orm import declared_attr
class CustomBase:
# Use declared_attr to define a property that SQLAlchemy should treat as a column
@declared_attr
def id(cls):
# Use cls.__tablename__ to dynamically create a sequence name based on the table name
return Column(Integer, Sequence(f"{cls.__tablename__}_id_seq"), primary_key=True)
# Create the base class using declarative_base(), passing in your custom base as a mixin
Base = declarative_base(cls=CustomBase)
And in my alembic migration, i have:
from sqlalchemy.schema import Sequence, CreateSequence, DropSequence
tables = [ ... ] # doing this bc i created 20 tables already before figuring out i had ID problems :)
def upgrade() -> None:
for table in tables:
sequence_name = f"{table}_id_seq"
op.execute(CreateSequence(Sequence(sequence_name)))
Seems to be working, but getting some interesting results on snowflake:
Hopefully this is inspiration to help someone else.