sqlmodel
sqlmodel copied to clipboard
Many to many relationship between a table and itself
First Check
- [X] I added a very descriptive title to this issue.
- [X] I used the GitHub search to find a similar issue and didn't find it.
- [X] I searched the SQLModel documentation, with the integrated search.
- [X] I already searched in Google "How to X in SQLModel" and didn't find any information.
- [X] I already read and followed all the tutorial in the docs and didn't find an answer.
- [X] I already checked if it is not related to SQLModel but to Pydantic.
- [X] I already checked if it is not related to SQLModel but to SQLAlchemy.
Commit to Help
- [X] I commit to help with one of those options 👆
Example Code
from typing import List, Optional
from sqlmodel import Field, Relationship, SQLModel
class Edge(SQLModel, table=True):
from_node_id: Optional[int] = Field(default=None, foreign_key="node.id", primary_key=True)
to_node_id: Optional[int] = Field(default=None, foreign_key="node.id", primary_key=True)
class Node(SQLModel, table=True):
id: Optional[int] = Field(default=None, primary_key=True)
from_nodes: List["Node"] = Relationship(back_populates="to_nodes", link_model=Edge)
to_nodes: List["Node"] = Relationship(back_populates="from_nodes", link_model=Edge)
Description
I want to be able to create a many to many relationship between a table and itself. In my example, I have a table Node
and a table Edge
which I'm using to represent a graph. I want the table Edge
to work as a link table establishing a many to many relationship between the table Node
and itself. Like that, I would be able to benefit from the SQLModel's lazy loading when querying the nodes and also get the in and out adjacency lists for each node very efficiently.
Please, let me know if it's already possible somehow.
Wanted Solution
The solution I am thinking about is to add a field
parameter to the Relationship
callable. Check out my wanted code.
Wanted Code
from typing import List, Optional
from sqlmodel import Field, Relationship, SQLModel
class Edge(SQLModel, table=True):
from_node_id: Optional[int] = Field(default=None, foreign_key="node.id", primary_key=True)
to_node_id: Optional[int] = Field(default=None, foreign_key="node.id", primary_key=True)
class Node(SQLModel, table=True):
id: Optional[int] = Field(default=None, primary_key=True)
from_nodes: List["Node"] = Relationship(back_populates="to_nodes", link_model=Edge, field="to_node_id")
to_nodes: List["Node"] = Relationship(back_populates="from_nodes", link_model=Edge, field="from_node_id")
Alternatives
I couldn't think about any decent alternative. Please, let me know if you have another idea.
Operating System
Linux
Operating System Details
No response
SQLModel Version
0.0.6
Python Version
Python 3.9.12
Additional Context
No response
What you can already do is set up relationships back-and-forth between the Node
and Edge
tables, such that
- every
Node
has a list of all outgoing edges and a list of all incoming edges and - every
Edge
has references to its "to"- and "from"-nodes.
Here is an example:
from typing import Optional
from sqlmodel import Field, Relationship, SQLModel
class Node(SQLModel, table=True):
id: Optional[int] = Field(default=None, primary_key=True)
edges_out: list["Edge"] = Relationship(
back_populates="from_node",
sa_relationship_kwargs={
"foreign_keys": "Edge.from_node_id",
"lazy": "selectin",
},
)
edges_in: list["Edge"] = Relationship(
back_populates="to_node",
sa_relationship_kwargs={
"foreign_keys": "Edge.to_node_id",
"lazy": "selectin",
},
)
class Edge(SQLModel, table=True):
from_node_id: Optional[int] = Field(
default=None,
foreign_key="node.id",
primary_key=True,
)
from_node: Optional[Node] = Relationship(
back_populates="edges_out",
sa_relationship_kwargs={"foreign_keys": "Edge.from_node_id"},
)
to_node_id: Optional[int] = Field(
default=None,
foreign_key="node.id",
primary_key=True,
)
to_node: Optional[Node] = Relationship(
back_populates="edges_in",
sa_relationship_kwargs={"foreign_keys": "Edge.to_node_id"},
)
def __repr__(self) -> str:
return f"Edge({self.from_node_id} -> {self.to_node_id})"
The foreign_keys
keyword argument is important. (See SQLAlchemy docs here.)
This allows you to quite a few things already. To get adjacency lists however you will still need the additional "hop" via the related edge
objects.
Here is a demo:
def main() -> None:
from sqlmodel import Session, create_engine, select
engine = create_engine("sqlite:///", echo=True)
SQLModel.metadata.create_all(engine)
session = Session(engine)
n1, n2, n3 = Node(), Node(), Node()
session.add_all([n1, n2, n3])
e1 = Edge(from_node=n1, to_node=n2)
e2 = Edge(from_node=n2, to_node=n3)
e3 = Edge(from_node=n1, to_node=n3)
session.add_all([e1, e2, e3])
session.commit()
nodes = session.execute(select(Node)).scalars().all()
for node in nodes:
print(f"{node.id=}")
print(f" {node.edges_out=}")
print(f" {node.edges_in=}")
print(f" to_nodes={[edge.to_node.id for edge in node.edges_out]}")
print(f" from_nodes={[edge.from_node.id for edge in node.edges_in]}")
if __name__ == "__main__":
main()
Output:
node.id=1
node.edges_out=[Edge(1 -> 2), Edge(1 -> 3)]
node.edges_in=[]
to_nodes=[2, 3]
from_nodes=[]
node.id=2
node.edges_out=[Edge(2 -> 3)]
node.edges_in=[Edge(1 -> 2)]
to_nodes=[3]
from_nodes=[1]
node.id=3
node.edges_out=[]
node.edges_in=[Edge(2 -> 3), Edge(1 -> 3)]
to_nodes=[]
from_nodes=[2, 1]
It would be really nice, if we could define a to_nodes
and a from_nodes
attribute on Node
directly to avoid going via the Edge
instances. But for this we need the association proxy from SQLAlchemy to work and I believe it is currently not supported in SQLModel. (At least I could not get it to work right now.)
Support for a custom association proxy might be a worthwhile feature in the future, but should maybe not be high priority IMHO.
Looks the same as this https://github.com/tiangolo/sqlmodel/issues/89
Thanks @daniil-berg, your answer already helped to improve my understanding about advanced usage of relationships!
In your answer above you have solved the case (as you stated) that every "Node" has a list of all outgoing edges and a list of all incoming edges
.
While this should also be true for my use case I need also that edges can have multiple incoming and outgoing nodes. Just think about a receipe where (simplified) Dough
and Butter
make a Cake
. Dough
, Butter
and Cake
are Node
instances and the baking process is an instance of the Edge
model. Of course at the same time you could use the node Butter
also for a different baking edge
.
Ideally, the Edge
model should also have additional fields (e.g. to stay with the baking analogy, the baking temperature).
I tried to update your solution by setting in the Edge
model the relationships from_node
and to_note
as a List
but this did not work unfortunately.
Use "remote_side" in "sa_relationship_kwargs"
class Node(SQLModel, table=True):
id: Optional[int] = Field(primary_key=True)
name: str
father_id: Optional[int] = Field(foreign_key="node.id")
father: Optional["Node"] = Relationship(
back_populates="childs",
sa_relationship_kwargs={"remote_side": "Node.id"}
)
childs: List["Node"] = Relationship(back_populates="father")
This seems to work for me for a self-referential many-to-many relationship with relationship attributes.
class Node(SQLModel, table=True):
id: Optional[int] = Field(default=None, primary_key=True)
edge_to: list["Edge"] = Relationship(
back_populates="from_node",
sa_relationship_kwargs={
"primaryjoin": "Node.id==Edge.from_node_id",
},
)
edge_from: list["Edge"] = Relationship(
back_populates="to_node",
sa_relationship_kwargs={
"primaryjoin": "Node.id==Edge.to_node_id",
},
)
class Edge(SQLModel, table=True):
from_node_id: int = Field(default=None, foreign_key="node.id", primary_key=True)
from_node: "Node" = Relationship(
back_populates="edge_to",
sa_relationship_kwargs={"primaryjoin": "Edge.from_node_id==Node.id"},
)
to_node_id: int = Field(default=None, foreign_key="node.id", primary_key=True)
to_node: "Node" = Relationship(
back_populates="edge_from",
sa_relationship_kwargs={"primaryjoin": "Edge.to_node_id==Node.id"},
)
link_type: str | None