sqlmodel icon indicating copy to clipboard operation
sqlmodel copied to clipboard

Simple instructions for a self referential table

Open michaelmcandrew opened this issue 3 years ago • 10 comments

First Check

  • [X] I added a very descriptive title to this issue.
  • [X] I used the GitHub search to find a similar issue and didn't find it.
  • [X] I searched the SQLModel documentation, with the integrated search.
  • [X] I already searched in Google "How to X in SQLModel" and didn't find any information.
  • [X] I already read and followed all the tutorial in the docs and didn't find an answer.
  • [X] I already checked if it is not related to SQLModel but to Pydantic.
  • [X] I already checked if it is not related to SQLModel but to SQLAlchemy.

Commit to Help

  • [X] I commit to help with one of those options 👆

Example Code

class Node(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    text: str
    parent_id: Optional[int] = Field(foreign_key="node.id")
    # parent: Optional["Node"] not sure what to put here
    # children: List[Node] not sure what to put here either :)

Description

I am trying to create a simple self referential model - using the SQL model equivalent of the adjacency list pattern described here: https://docs.sqlalchemy.org/en/14/orm/self_referential.html

I am only a litte familiar with SQL alchemy but was unable to translate their example into one that would work with SQLmodel.

In your docs you said: "Based on SQLAlchemy [...] SQLModel is designed to satisfy the most common use cases and to be as simple and convenient as possible for those cases, providing the best developer experience". I was assuming that a self referential model would be a fairly common use case but totally appreciate that I could be wrong on this :)

I see that there is an sa_relationship param that you can pass 'complicated stuff' too but I was not sure whether I should be using that (or how I would do so if I was meant to) - sorry just a bit too new to this.

Crossing my fingers that it is straight forward to complete the commented lines in my example.

Operating System

Linux

Operating System Details

No response

SQLModel Version

0.0.4

Python Version

3.9.7

Additional Context

No response

michaelmcandrew avatar Oct 11 '21 22:10 michaelmcandrew

Did some more experimentation

With a model defined like so

class Node(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    text: str
    parent_id: Optional[int] = Field(foreign_key="node.id")
    children: List["Node"] = Relationship(back_populates="parent")
    parent: Optional["Node"] = Relationship(back_populates="children")

running session.exec(select(Node)).all() throws the following error;

sqlalchemy.exc.ArgumentError: Context.children and back-reference Context.parent are both of the same direction symbol('ONETOMANY'). Did you mean to set remote_side on the many-to-one side ?

Will carry on investigating...

michaelmcandrew avatar Oct 13 '21 17:10 michaelmcandrew

@michaelmcandrew did you try using sa_relationship_kwargs?

This is how I'm using it:

class AccountBase(SQLModel):
  total: float = 0.0
  name: str
 
class Account(AccountBase, table=True):
  __tablename__ = "accounts"

  id: Optional[int] = Field(default=None, primary_key=True)
  parent_id: Optional[int] = Field(default=None, foreign_key="accounts.id")
  children: List["Account"] = Relationship(
    sa_relationship_kwargs=dict(
      cascade="all",
      backref=backref("parent", remote_side="Account.id"),
    )  
  )

  def append(self, child: "Account"):
    self.children.append(child)

def main():
   with Session(engine) as session:
       account_1 = Account(total=200, name="Parent")
       account_2 = Account(total=300, name="Child")
       account_3 = Account(total=400, name="Grandson")

       account_2.append(account_3)
       account_1.append(account_2)

My references:

giubaru avatar Oct 31 '21 14:10 giubaru

Thanks @giubaru - I suspected that sa_relationship_kwargs might be part of the solution but had not got round to testing it out. I will take a look and report back.

michaelmcandrew avatar Nov 01 '21 12:11 michaelmcandrew

This is working for me @giubaru - thanks!

michaelmcandrew avatar Nov 25 '21 08:11 michaelmcandrew

Cannot use parent field

from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import backref, relationship
from sqlalchemy.ext.declarative import declarative_base
from sqlmodel import Field, Relationship, SQLModel
from typing import List, Optional

Base = declarative_base()


class NodeSa(Base):
    __tablename__ = 'node_sa'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('node_sa.id'))
    children = relationship(
        'NodeSa',
        backref=backref('parent', remote_side=[id]),
    )
print(NodeSa(parent=NodeSa()).parent)  # <__main__.NodeSa object at 0x7fde515462c0>


class NodeSm(SQLModel, table=True):
    id: int = Field(primary_key=True)
    parent_id: Optional[int] = Field(default=None, foreign_key='node_sm.id')
    children: List['NodeSm'] = Relationship(
        sa_relationship_kwargs=dict(
            backref=backref('parent', remote_side='NodeSm.id'),
        ),
    )
try:
    NodeSm(parent=NodeSm()).parent
except Exception as e:
    print(type(e))  # <class 'sqlalchemy.exc.NoForeignKeysError'>

bugchecker avatar Feb 23 '22 16:02 bugchecker

Cannot use parent field

from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import backref, relationship
from sqlalchemy.ext.declarative import declarative_base
from sqlmodel import Field, Relationship, SQLModel
from typing import List, Optional

Base = declarative_base()


class NodeSa(Base):
    __tablename__ = 'node_sa'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('node_sa.id'))
    children = relationship(
        'NodeSa',
        backref=backref('parent', remote_side=[id]),
    )
print(NodeSa(parent=NodeSa()).parent)  # <__main__.NodeSa object at 0x7fde515462c0>


class NodeSm(SQLModel, table=True):
    id: int = Field(primary_key=True)
    parent_id: Optional[int] = Field(default=None, foreign_key='node_sm.id')
    children: List['NodeSm'] = Relationship(
        sa_relationship_kwargs=dict(
            backref=backref('parent', remote_side='NodeSm.id'),
        ),
    )
try:
    NodeSm(parent=NodeSm()).parent
except Exception as e:
    print(type(e))  # <class 'sqlalchemy.exc.NoForeignKeysError'>

Have my own share of issues I am currently working on with this self referential stuff and came across your example.

The error is correct because the node_sm table does not exist at all.

You can fix the error by either:

removing the underscore:

- parent_id: Optional[int] = Field(default=None, foreign_key='node_sm.id')
+ parent_id: Optional[int] = Field(default=None, foreign_key='nodesm.id')

or setting the table name manually:

class NodeSm(SQLModel, table=True):
+    __tablename__ = "node_sm"
    id: int = Field(primary_key=True)
    ...

BUT then you directly get the next issue: None

When you try to print the NodeSm.parent like so:

try:
-    NodeSm(parent=NodeSm()).parent
+    print(NodeSm(parent=NodeSm()).parent)
except Exception as e:
    print(type(e))  # <class 'sqlalchemy.exc.NoForeignKeysError'>

You simply get None

EDIT

But you can work the other way around.

For readability add a new field:

class NodeSm(SQLModel, table=True):
    id: int = Field(primary_key=True)
+    name: str
    ...

Then change the way to connect them:

try:
-    print(NodeSm(name="child", parent=NodeSm()).parent)
+    child = NodeSm(name="child")
+    parent = NodeSm(name="parent")
+    parent.children.append(child)

+    print(f"{child.parent=}")
+    print(f"{parent.children=}")
except Exception as e:
    print(type(e))

The output is:

child.parent=NodeSm(name='parent', parent_id=None, children=[NodeSm(name='child', parent_id=None, parent=NodeSm(name='parent', parent_id=None, children=[...]))])
parent.children=[NodeSm(name='child', parent_id=None, parent=NodeSm(name='parent', parent_id=None, children=[...]))]

Root Cause

I also have a guess as to why you can add children, but not a parent. I think pydantic is the culprit here. Reason for this is simple:

On the NodeSm class we create an attribute called children but we never created an attribute called parent. Therefore pydantic just discards this information in the init of a new NodeSm instance. Sure the parent field exists but it is dynamically added by sqlalchemy and therefore it is unknown to pydantic.

In the line with print(f"{child.parent=}") you also get a mypy error which supports my assumption: "NodeSm" has no attribute "parent"; maybe "parent_id"? [attr-defined]mypy(error)

EDIT 2

I did some further research and found the option to allow extra attributes for pydantic models in the pydantic docs.

The default is ignore -> no Exception and when you set it so forbid is throws one, but in this case the exception is somewhere catched and does not surface. BUT when you set it to allow it works:

...
+ from pydantic import Extra
...
-class NodeSm(SQLModel, table=True):
+class NodeSm(SQLModel, table=True, extra=Extra.allow):
    id: int = Field(primary_key=True)
    ...

Unfortunately this is a solution using a hammer, as we allow all possible extra attributes, even those who does not exist. I also saw that you can alter the JSON schema of a model in the pydantic docs, which could be the safest solution.

EDIT 3

The schema manipulation does not work. It changed nothing.

class Node(SQLModel, table=True):
    id: int = Field(primary_key=True)
    parent_id: Optional[int] = Field(default=None, foreign_key='node.id')
    children: List['Node'] = Relationship(
        sa_relationship_kwargs=dict(
            backref=backref('parent', remote_side='Node.id'),
        ),
    )

    class Config:
        @staticmethod
        def schema_extra(schema: Dict[str, Any], model: Type['Node']) -> None:
            schema['properties']['parent'] = {'title': 'Parent', '$ref': '#/definitions/Node'}

TL;DR

  1. The table name in the parent_id's foreign_key declaration is wrong. Set __tablename__ or remove the underscore. (See first part of this post)
  2. Allow extra fields for pydantic: extra=Extra.allow (See EDIT 2)

Cielquan avatar Apr 01 '22 09:04 Cielquan

So far, this seems to be the best question/answer on self-referential tables for SQLModel. With all the versions and edits it seems unclear as to what the final "simple" working solution was. Would it be possible to post a definitive example if this issue was resolved?

johnziebro avatar May 12 '22 11:05 johnziebro

Well unfortunately this is only a workaround and not a real solution, because the typing information is missing. I think for a real solution there needs to be first party support for self-referential tables coming from SQLModel itself.

The project at work where this issue occurs got postponed and I will probably work on it again in the next 4 to 6 weeks, but no promises.

The TL;DR above is this workaround but to summarize, this should (I did not test it) give you a working self-referential table.

from typing import List, Optional

from pydantic import Extra
from sqlalchemy.orm import backref
from sqlmodel import Field, Relationship, SQLModel


class DBTable(SQLModel, table=True, extra=Extra.allow):

    __tablename__ = "db_table"

    id: int = Field(primary_key=True)
    parent_id: Optional[int] = Field(default=None, foreign_key="db_table.id")

    children: List["DBTable"] = Relationship(
        sa_relationship_kwargs=dict(
            backref=backref("parent", remote_side="DBTable.id"),
        ),
    )

On the DBTable class you 4 attributes:

  • id is the primary key
  • parent_id is the self-referential foreign key matching the DBTable.id of the parent
  • children is a list of DBTables which have this instance's id as parent_id
  • parent is the DBTable with the id specified as parent_id in this instance

But only the first two are actual columns in the database.

The issue with this workaround is, that the DBTable.parent attribute is dynamically created and thus typing information is missing. Therefore type checkers will complain if you use it and you cannot autocomplete it either. But because extra=Extra.allow is set pydantic won't remove the parent attribute, so it at least works.

I hope this helps.

Cielquan avatar May 12 '22 16:05 Cielquan

I created an answer for this with more detailed explanations and tests here, but this is the short version:

from typing import Optional
from sqlmodel import Field, Relationship, SQLModel


class Node(SQLModel, table=True):
    __tablename__ = 'node'  # just to be explicit

    id: Optional[int] = Field(default=None, primary_key=True)
    data: str
    parent_id: Optional[int] = Field(
        foreign_key='node.id',  # notice the lowercase "n" to refer to the database table name
        default=None,
        nullable=True
    )
    parent: Optional['Node'] = Relationship(
        back_populates='children',
        sa_relationship_kwargs=dict(
            remote_side='Node.id'  # notice the uppercase "N" to refer to this table class
        )
    )
    children: list['Node'] = Relationship(back_populates='parent')
    # with Python `<3.9` use `typing.List` instead of `list`

Instead of using the legacy backref parameter, the desired model attributes parent and children are declared explicitly. This also allows type checking to work as expected. And no need for extra=Extra.allow.

Check the Stackoverflow post I linked above for some illustrations/tests.

I agree that this would be nice to have in the official docs. Maybe I will create a PR for this.

daniil-berg avatar Aug 23 '22 14:08 daniil-berg

I created an answer for this with more detailed explanations and tests here, but this is the short version:

Thank you very much, I've lost a couple of hours testing with ForwardRef, Self, etc. without success :-(

xavipolo avatar Sep 03 '22 16:09 xavipolo

Any success with these?

I am trying to create a join table to link a model (Paper) back to other Paper objects to make a "similar papers" function.

class PaperSimilarLink(SQLModel, table=True):
    paper_id: Optional[int] = Field(
        default=None, foreign_key="paper.id", primary_key=True
    )
    similar_id: Optional[int] = Field(
        default=None, foreign_key="paper.id", primary_key=True
    )

class Paper(SQLModel, HasRelatedObjects, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    similar: List["Paper"] = Relationship(
        back_populates="similar",
        link_model=PaperSimilarLink,
        sa_relationship=dict(
            primary_join=PaperSimilarLink.paper_id,
            secondary_join=PaperSimilarLink.similar_id,
        ),
    )


without the primary_join and secondary_join it creates an error:

sqlalchemy.exc.AmbiguousForeignKeysError: Could not determine join condition between parent/child tables on relationship PaperSection.similar - there are multiple foreign key paths linking the tables via secondary table 'papersectionsimilarlink'. Specify the 'foreign_keys' argument, providing a list of those columns which should be counted as containing a foreign key reference from the secondary table to each of the parent and child tables.

After adding it, it runs, but I get an error trying to access Paper.similar:

AttributeError: 'Paper' object has no attribute 'similar'

philmade avatar Nov 30 '22 17:11 philmade

Fix to your Paper model:

similar: List["Paper"] = Relationship(
        link_model=PaperSimilarLink,
        sa_relationship_kwargs=dict(
            primaryjoin="Paper.id == PaperSimilarLink.paper_id",
            secondaryjoin="Paper.id == PaperSimilarLink.similar_id",
        ),
    )

Working code:

from sqlmodel import SQLModel, Field, create_engine, Session, select, Relationship
from typing import Optional, List


class PaperSimilarLink(SQLModel, table=True):
    paper_id: Optional[int] = Field(
        default=None, foreign_key="paper.id", primary_key=True
    )
    similar_id: Optional[int] = Field(
        default=None, foreign_key="paper.id", primary_key=True
    )


class Paper(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    similar: List["Paper"] = Relationship(
        link_model=PaperSimilarLink,
        sa_relationship_kwargs=dict(
            primaryjoin="Paper.id == PaperSimilarLink.paper_id",
            secondaryjoin="Paper.id == PaperSimilarLink.similar_id",
        ),
    )


engine = create_engine("sqlite:///")

with Session(engine) as session:
    SQLModel.metadata.create_all(engine)

    p1 = Paper(id=1)
    p2 = Paper(id=2)
    p3 = Paper(id=3)
    session.add_all([p1, p2, p3])
    session.commit()

    session.refresh(p1)
    session.refresh(p2)
    session.refresh(p3)

    p1.similar = [p2, p3]
    session.add(p1)
    session.commit()

    statement = select(Paper)
    paper = session.exec(statement).unique().all()

    for i in paper:
        print(i.id, i.similar)

    """
    1 [Paper(id=2), Paper(id=3)]
    2 []
    3 []
    """

meirdev avatar Nov 30 '22 18:11 meirdev