sqlalchemy2-stubs icon indicating copy to clipboard operation
sqlalchemy2-stubs copied to clipboard

Supporting __get__ and __set__ on Column object

Open rslinckx opened this issue 4 years ago • 108 comments

The old sqlalchemy-stubs from dropbox included an approximation for the behavior of mapped classes using declarative setup, that is

class Foo(Model):
  a = Column(DateTime)

Foo.a
# type is Column[DateTime]
Foo().a
# type is datetime

They did so by defining the get method on Column here: https://github.com/dropbox/sqlalchemy-stubs/blob/master/sqlalchemy-stubs/sql/schema.pyi#L131

Maybe the same can be done for set, so that

Foo().a = 3
# wrong type, expected datetime object

Of course the whole thing is supported already when using the mypy plugin, but for example when using pylance (vscode type engine) which does not support plugins, it gets confused when using and assigning object values as mapped columns and complains about mismatching types.

I think the same is done on relationship() property

I'm unsure of the deeper implications of adding such type hints, though

rslinckx avatar Sep 01 '21 13:09 rslinckx

id have to look more but at the moment I'm +0 leaning towards -1 because it seems like it wont work consistently for all kinds of mapper scenarios, also our "_T" is not "int" it's Integer, so it may not be possible in any case.

zzzeek avatar Sep 01 '21 13:09 zzzeek

Hi,

As mike mentioned, I think the best we can do is to return Any in every column, since we don't have a python type in the column. That's assuming it does not pose a problem for the plugin

CaselIT avatar Sep 01 '21 13:09 CaselIT

I think what they've done in the dropbox stub is reassign the various sqlalchemy types to their 'equivalent' python types, for better or worse (aka Integer = int). I have to admit it's not particularly pretty and probably wrong in many cases. I'm just looking for a way to be able to use the pylance type checker without having every other line underlined because "Column[Integer] can't be assigned int" which makes it really hard to get proper type checking

rslinckx avatar Sep 01 '21 13:09 rslinckx

yeah i specifically did not want to do that, and I had assumed the plugin would be relied upon. im hoping tools like pylance add logic that's specific to SQLAlchemy for this since other Python libs including dataclasses etc. need special rules too.

zzzeek avatar Sep 01 '21 13:09 zzzeek

Would there be a way to define some kind of manual type where Class.col would return the regular Column[Integer] from sqlalchemy, and Class().col would return a manually defined python type, something like:

class Foo:
   a: ManuallySpecifiedColumn[int, Integer] = Column(Integer)

?

rslinckx avatar Sep 01 '21 13:09 rslinckx

yeah i specifically did not want to do that, and I had assumed the plugin would be relied upon. im hoping tools like pylance add logic that's specific to SQLAlchemy for this since other Python libs including dataclasses etc. need special rules too.

I think they've made it pretty clear they would never add anything specific which is not in the stdlib :(

rslinckx avatar Sep 01 '21 13:09 rslinckx

its a bug in pylance. extensions are needed for Python typing, they need to support them.

zzzeek avatar Sep 01 '21 13:09 zzzeek

See for example: https://github.com/microsoft/pylance-release/issues/845#issuecomment-801202038

rslinckx avatar Sep 01 '21 14:09 rslinckx

hybrid properties should work completely, they are plain descriptors, @CaselIT do we not have hybrids working yet?

zzzeek avatar Sep 01 '21 14:09 zzzeek

wow apparntly not. i had a complete working vresion of hybrids i have to find it now

zzzeek avatar Sep 01 '21 14:09 zzzeek

hybrid properties should work completely, they are plain descriptors, @CaselIT do we not have hybrids working yet?

I was not aware we had a problem there :)

I'll try looking if supporting returning Any form the __get__ works or if it causes issues to mypy.

One thing to note is that the columns are not descriptors in code

CaselIT avatar Sep 01 '21 16:09 CaselIT

@rslinckx can you try https://github.com/sqlalchemy/sqlalchemy2-stubs/pull/172 ?

CaselIT avatar Sep 01 '21 20:09 CaselIT

I've made class level column return mapped, since that's what is actually there at runtime, instead of column as in the dropbox stubs

CaselIT avatar Sep 01 '21 20:09 CaselIT

The pull request does suppress the red wiggles, in that it now considers an object-level as Any, and so is compatible with anything. it does not suppress errors on the set side, though so Obj().a = 4 will trigger an error because int is not compatible with Column[str]

I was wondering if it would be acceptable to add a Generic parameter to the Column class representing the python type of the data coming in and out of the column so that get can return either -> Mapped[python type] at class level or -> python type at object level?

rslinckx avatar Sep 03 '21 10:09 rslinckx

though so Obj().a = 4 will trigger an error because int is not compatible with Column[str]

is this when using pyright?

I was wondering if it would be acceptable to add a Generic parameter to the Column class representing the python type of the data coming in and out of the column

this is not feasible at the moment, without changes at how the plugin and the stubs work, It was an explicit choice to make Column generic on the type engine object and not on the python type, since this is more similar to how the code actually works.

CaselIT avatar Sep 03 '21 10:09 CaselIT

I want to look into moving to a new style of declarative that eliminates these issues. we already have declarative forms that work fully without a plugin, such as https://docs.sqlalchemy.org/en/14/orm/extensions/mypy.html#mapping-columns-with-imperative-table ; the second example at https://docs.sqlalchemy.org/en/14/orm/extensions/mypy.html#what-the-plugin-does shows how Mapped is used by the plugin.

I think for this issue, instead of creating more types that "lie", we should introduce new declarative forms that produce correct typing up front. I propose using Mapped and/or some new kind of declarative form that simply does all the things that we want.

Here's a very dramatically different form, which is not really correct but something like this, which could easily ride on top of the existing declarative mechanics:

class A(Base):
    __tablename__ = 'a'

   id = m.int(primary_key=True)
   name = m.string(datatype=String)
   some_expr = m.int(expr=<some sql expression>)
   bs = m.related("B")

there's no reason we have to keep having people use Column etc in their mappings if all the tools are working against it. let's work with the tools

zzzeek avatar Sep 03 '21 12:09 zzzeek

I was wondering if it would be acceptable to add a Generic parameter to the Column class representing the python type of the data coming in and out of the column so that get can return either -> Mapped[python type] at class level or -> python type at object level?

IMO it's a failure of the typing system that a type that is essentially Column[Integer[int]] can't be represented without losing features. Instead of Column[int] we have Mapping[int] and we should just try to make the latter do what we need. Column should really not have anything to do with this.

zzzeek avatar Sep 03 '21 12:09 zzzeek

are you using pylance in "basic" type checking mode?

zzzeek avatar Sep 03 '21 12:09 zzzeek

I think I just need to know what we want here. Do we want to be typing out [int] [str] etc in our mappings? Here's a quick POC, I get no squigglys in pylance basic mode, and mypy passes too, we wouldnt even need the plugin here except for the __init__ method:

from sqlalchemy import Column
from sqlalchemy import create_engine
from sqlalchemy import ForeignKey
from sqlalchemy import Integer
from sqlalchemy import String
from sqlalchemy.orm import Mapped
from sqlalchemy.orm import registry
from sqlalchemy.orm import relationship
from sqlalchemy.orm import Session
from typing import Any


class M:
    def __or__(self, other: Any) -> Any:
        return other


m = M()

r: registry = registry()


@r.mapped
class A:
    __tablename__ = "a"

    id: Mapped[int] = m | Column(Integer, primary_key=True)
    data: Mapped[str] = m | Column(String)
    bs: "Mapped[B]" = m | relationship("B")


@r.mapped
class B:
    __tablename__ = "b"
    id: Mapped[int] = m | Column(Integer, primary_key=True)
    a_id: Mapped[int] = m | Column(ForeignKey("a.id"))
    data: Mapped[str] = m | Column(String)


e = create_engine("sqlite://", echo=True)
r.metadata.create_all(e)

s = Session(e)

a1 = A()
x: str = a1.data

a1.data = "some str"

expr = A.data == 5

The original sqlalchemy stubs only has Any for the types coming from Column. Without a plugin, they can't determine Column[int] either unless it is declared in the code.

I propose looking into helpers like the above for this use case.

zzzeek avatar Sep 03 '21 12:09 zzzeek

class A(Base): tablename = 'a'

id = m.int(primary_key=True) name = m.string(datatype=String) some_expr = m.int(expr=) bs = m.related("B")

I don't personally like that example. It makes it very hard to understand what's being defined as a table.

I don't think we need to actually to change the declaration of something different, we can just change how the init works. (I'm not sure if it's possible also when not using a base class).

Like how pydantic works https://github.com/samuelcolvin/pydantic/blob/5ccbdcb5904f35834300b01432a665c75dc02296/pydantic/main.py#L118-L267 or even sqlmodel: https://github.com/tiangolo/sqlmodel/blob/02da85c9ec39b0ebb7ff91de3045e0aea28dc998/sqlmodel/main.py#L232-L369

If I'm not mistaken they remove the FieldInfo type from the type checking and set the correct type

CaselIT avatar Sep 03 '21 12:09 CaselIT

the __init__ of what? Column? I dont want to change Column at all. I think it's time we change that declarative points to things that aren't declarative attributes. this decision wouldn't have been made in a pep 484 world.

zzzeek avatar Sep 03 '21 12:09 zzzeek

no of the class

CaselIT avatar Sep 03 '21 13:09 CaselIT

how does that help the case where we access A().data ?

the little trick I have above works even if you don't give it a type, this seems too easy so i must be missing something

class M:
    def __or__(self, other: Any) -> Any:
        return other


m = M()

r: registry = registry()


@r.mapped
class A:
    __tablename__ = "a"

    id = m | Column(Integer, primary_key=True)
    data = m | Column(String)
    bs: "Mapped[B]" = m | relationship("B")

zzzeek avatar Sep 03 '21 13:09 zzzeek

it's essentially just putting cast(Any) around it, with less (keyboard) typing

zzzeek avatar Sep 03 '21 13:09 zzzeek

this would require code changes and plugin changes but what about a syntax like this:

@r.mapped
class A:
    __tablename__ = "a"

    # will figure out Mapped[int]
    id = m(int) | Column(Integer, primary_key=True)

    # or we can be explicit
    data  : Mapped[str] = m | Column(String)

   # or just omit it, and it works out as Any, we'd get the plugin to detect this for real typing

   x = m | Column(Integer)

zzzeek avatar Sep 03 '21 13:09 zzzeek

basically what I meant is this:

import pydantic as P
import sqlalchemy as sa
import sqlalchemy.orm as orm


class Foo(P.BaseModel):

    x: int = P.Field(22)
    y: float = P.Field(42)


reveal_type(Foo.x)
reveal_type(Foo().x)

# print(Foo.x)

r = orm.registry()
Base = r.generate_base()


class Bar(Base):
    __tablename__ = "bar"

    a: int = sa.Column(sa.Integer)
    b: str = sa.Column(sa.Text)


reveal_type(Bar.a)
reveal_type(Bar().a)

I get this output:

try.py:21: note: Revealed type is "builtins.int"
try.py:22: note: Revealed type is "builtins.int"
try.py:30: error: Variable "try.Base" is not valid as a type
try.py:30: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases
try.py:30: error: Invalid base class "Base"
try.py:33: error: Incompatible types in assignment (expression has type "Column[Integer]", variable has type "int")
try.py:34: error: Incompatible types in assignment (expression has type "Column[Text]", variable has type "str")
try.py:37: note: Revealed type is "builtins.int"
try.py:38: note: Revealed type is "builtins.int"

CaselIT avatar Sep 03 '21 13:09 CaselIT

I don't think we need to. see that pydantic manages to erase the type Field from the class and mypy is happy

CaselIT avatar Sep 03 '21 13:09 CaselIT

The above is without the mypy plugin

CaselIT avatar Sep 03 '21 13:09 CaselIT

im not understanding what is proposed.

zzzeek avatar Sep 03 '21 13:09 zzzeek

I get the same error with pyright for the example above:

try.py:21:13 - info: Type of "Foo.x" is "int"
try.py:22:13 - info: Type of "Foo().x" is "int"
try.py:30:11 - error: Expected class type but received "type" (reportGeneralTypeIssues)
try.py:30:11 - error: Argument to class must be a base class (reportGeneralTypeIssues)
try.py:33:14 - error: Expression of type "Column[Integer]" cannot be assigned to declared type "int"
  "Column[Integer]" is incompatible with "int" (reportGeneralTypeIssues)
try.py:34:14 - error: Expression of type "Column[Text]" cannot be assigned to declared type "str"
  "Column[Text]" is incompatible with "str" (reportGeneralTypeIssues)
try.py:37:13 - info: Type of "Bar.a" is "int"
try.py:38:13 - info: Type of "Bar().a" is "int"

CaselIT avatar Sep 03 '21 13:09 CaselIT