fastapi-pagination
fastapi-pagination copied to clipboard
`Total` returns more values than expected (not applying distinct on count)
Hi,
I have the tables Author
and Books
.
Author | Book |
---|---|
Alice | A |
Alice | B |
Bob | C |
In my FastAPI endpoint, when paginating the select(Author).join(Books)
I want to respond with a list that has 2 dicts:
- Alice (with her two books)
- Bob (with his one book)
It works great but the total
is off. Instead of counting the unique Authors, it counts the total rows of the query.
Is there an option to fix that?
Perhaps relevant:
- using FastAPI (+pydantic)
- async PostgreSQL
Hi @Riya-900,
Could you please change your query to smth like this:
select(Author).options(selectinload(Author.books))
@Riya-900 Hello! Have you resolved this issue? If yes, then how did you do it?
I passed the total_group_by
parameter to the count_query
function and called query.group_by(total_group_by)
to ensure the correct calculation of total:
# file: fastapi_pagination/ext/sqlalchemy.py
def count_query(query: Select, *, use_subquery: bool = True, total_group_by=None) -> Select:
query = query.order_by(None).options(noload("*"))
if total_group_by:
query = query.group_by(total_group_by)
if use_subquery:
return select(func.count()).select_from(query.subquery())
return query.with_only_columns( # noqa: PIE804
func.count(),
**{"maintain_column_froms": True},
)
@uriyyo, perhaps this can help you improve your product.
@bnku I guess the issue is with how you load relationships. I am not sure that we need to use group_by
to get count
.
@Riya-900 Any updates, does this comment help you?
https://github.com/uriyyo/fastapi-pagination/issues/674#issuecomment-1551275725
@uriyyo In my use case, I really need to use .join()
. How can I fix this issue?
Hi @lqmanh,
Could you please show an example?
@uriyyo Sorry for my late reply, but I'm doing some kind of 3-way joining like this:
class Status(DeclarativeBase):
group_id: Mapped[UUID]
class Item(DeclarativeBase):
status: Mapped[Status] = relationship(...)
documents: Mapped[list[Document]] = relationship(...)
class Document(DeclarativeBase):
item_id: Mapped[UUID]
status_group_id: Mapped[UUID]
stmt = (
select(Item)
.join(Item.status)
.join(Item.documents)
.where(Status.group_id == Document.status_group_id) # only include documents of current status group
.options(
contains_eager(Item.status),
contains_eager(Item.documents),
)
)
UPDATED: Can fastapi-pagination support distinct count, by passing column(s) to paginate()
? paginate(..., count_distinct=Item.id)
for example.
@lqmanh Could you try to change your query to:
stmt = (
select(Item)
.join(Item.status)
.join(Item.documents)
.options(
contains_eager(Item.status),
selectinload(Item.documents), # use selectinload() to load all documents
)
)
@uriyyo ~Then how do I filter only documents associated with current status group?~ Ah sorry, misread your reply. Using selectinload or joinedload independently won't work anyway
@lqmanh If you have your relationship configured correctly, it will be done automatically
@lqmanh Could you please config like this:
from __future__ import annotations
from operator import and_
from typing import Any
from uuid import UUID, uuid4
from sqlalchemy import ForeignKey, create_engine, select
from sqlalchemy.orm import (
DeclarativeBase,
Mapped,
joinedload,
mapped_column,
relationship,
sessionmaker,
)
from fastapi_pagination import Page, Params, set_page
from fastapi_pagination.ext.sqlalchemy import paginate
engine = create_engine("sqlite:///:memory:", echo=True)
class Base(DeclarativeBase):
pass
class Status(Base):
__tablename__ = "statuses"
id: Mapped[int] = mapped_column(primary_key=True, autoincrement=True)
item_id: Mapped[UUID] = mapped_column(ForeignKey("items.id"))
group_id: Mapped[UUID] = mapped_column()
item_documents: Mapped[list[Document]] = relationship(
primaryjoin=lambda: and_(
Document.item_id == Status.item_id,
Document.status_group_id == Status.group_id,
),
foreign_keys=[item_id, group_id],
viewonly=True,
uselist=True,
)
class Item(Base):
__tablename__ = "items"
id: Mapped[int] = mapped_column(primary_key=True, autoincrement=True)
status: Mapped[Status] = relationship()
documents: Mapped[list[Document]] = relationship()
class Document(Base):
__tablename__ = "documents"
id: Mapped[int] = mapped_column(primary_key=True, autoincrement=True)
item_id: Mapped[UUID] = mapped_column(ForeignKey("items.id"))
status_group_id: Mapped[UUID]
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)
with Session() as session:
group_id_1 = uuid4()
group_id_2 = uuid4()
session.add_all([
Item(
status=Status(
group_id=group_id_1,
),
documents=[
Document(status_group_id=group_id_1),
Document(status_group_id=group_id_1),
Document(status_group_id=group_id_2),
],
),
Item(
status=Status(
group_id=group_id_2,
),
documents=[
Document(status_group_id=group_id_1),
Document(status_group_id=group_id_2),
Document(status_group_id=group_id_2),
],
),
])
session.commit()
with set_page(Page[Any]), Session() as session:
stmt = (
select(Item)
.options(joinedload(Item.status).joinedload(Status.item_documents))
)
page = paginate(session, stmt, Params())
print(page)
@lqmanh Or you can try this config:
from __future__ import annotations
from operator import and_
from typing import Any
from uuid import UUID, uuid4
from sqlalchemy import ForeignKey, create_engine, select
from sqlalchemy.orm import (
DeclarativeBase,
Mapped,
joinedload,
mapped_column,
relationship,
sessionmaker,
)
from fastapi_pagination import Page, Params, set_page
from fastapi_pagination.ext.sqlalchemy import paginate
engine = create_engine("sqlite:///:memory:", echo=True)
class Base(DeclarativeBase):
pass
class Status(Base):
__tablename__ = "statuses"
id: Mapped[int] = mapped_column(primary_key=True, autoincrement=True)
item_id: Mapped[UUID] = mapped_column(ForeignKey("items.id"))
group_id: Mapped[UUID] = mapped_column()
class Item(Base):
__tablename__ = "items"
id: Mapped[int] = mapped_column(primary_key=True, autoincrement=True)
status: Mapped[Status] = relationship()
documents: Mapped[list[Document]] = relationship()
status_documents: Mapped[list[Document]] = relationship(
secondary=Status.__table__,
primaryjoin=lambda: Status.item_id == Item.id,
secondaryjoin=lambda: and_(
Document.item_id == Status.item_id,
Document.status_group_id == Status.group_id,
),
viewonly=True,
)
class Document(Base):
__tablename__ = "documents"
id: Mapped[int] = mapped_column(primary_key=True, autoincrement=True)
item_id: Mapped[UUID] = mapped_column(ForeignKey("items.id"))
status_group_id: Mapped[UUID]
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)
with Session() as session:
group_id_1 = uuid4()
group_id_2 = uuid4()
session.add_all([
Item(
status=Status(
group_id=group_id_1,
),
documents=[
Document(status_group_id=group_id_1),
Document(status_group_id=group_id_1),
Document(status_group_id=group_id_2),
],
),
Item(
status=Status(
group_id=group_id_2,
),
documents=[
Document(status_group_id=group_id_1),
Document(status_group_id=group_id_2),
Document(status_group_id=group_id_2),
],
),
])
session.commit()
with set_page(Page[Any]), Session() as session:
stmt = (
select(Item)
.options(
joinedload(Item.status),
joinedload(Item.status_documents),
)
)
page = paginate(session, stmt, Params())
print(page)
@uriyyo Looks nice, but it won't work without setting Status.item_id
, which isn't true because Status-Item is one-to-many
Actually, I managed to solve this issue by applying distinct()
to count_query()
:
def count_query(query: Select, *, use_subquery: bool = True) -> Select:
query = query.order_by(None).options(noload("*"))
if use_subquery:
return select(func.count()).select_from(query.distinct().subquery())
return query.distinct().with_only_columns( # noqa: PIE804
func.count(),
**{"maintain_column_froms": True},
)
Do you think it works for all cases?
Hi all,
Sorry for the long response.
Unfortunately, we can't fix this issue using distinct count cause it will break the way pagination works. You need to play around with the way you fetch relationships, it should help to solve this issue.