[Improvement] Implement Pagination for List Endpoints
Is your feature request related to a problem? Please describe.
Currently, our list endpoints (e.g., /api/v1/collections/{collection_id}/documents) fetch and return all records from the database in a single response. As the amount of data in our application grows, this approach is becoming unsustainable and leads to several critical issues:
- Performance Degradation: Fetching thousands of rows can cause significant database load and slow API response times.
- High Memory Consumption: Loading all objects into memory on the server can lead to high RAM usage and potential service instability.
- Poor Client Experience: Clients (web or mobile frontends) are forced to download a massive payload, resulting in long loading times and a sluggish user interface.
- Lack of Scalability: Response size and query cost grow with every record added, so the API in its current state will not scale as we add more data.
Describe the solution you'd like
I propose implementing offset-based pagination for all list endpoints. This will allow clients to request data in manageable "pages" or chunks.
Proposed Implementation Details:
- API Query Parameters: The list endpoints should accept two new optional query parameters:
  - offset (integer): The number of items to skip from the beginning of the list. Defaults to 0.
  - limit (integer): The maximum number of items to return in the response. Defaults to a reasonable value like 50.

  Example Request:

      GET /api/v1/items?offset=100&limit=25

  This would retrieve 25 items, starting after the first 100 items.
- Enforce a Maximum Limit: To protect the server from abusive requests, we should enforce a maximum value for the limit parameter (e.g., 100). If a client requests a limit greater than the maximum, the API should cap it at the maximum allowed value.
- Updated API Response Structure: The response body for paginated endpoints should be a JSON object that includes metadata about the pagination state, in addition to the data itself. This gives the client all the information needed to build pagination controls.

  Example Response:

      {
        "total": 1250,
        "limit": 25,
        "offset": 100,
        "data": [
          { "id": 101, "name": "Item 101", ... },
          { "id": 102, "name": "Item 102", ... },
          // ... up to 25 items
        ]
      }

  - total: The total number of items available in the database that match the query.
  - limit: The limit that was used for this request.
  - offset: The offset that was used for this request.
  - data: The array of items for the current page.
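To make the envelope concrete, here is a minimal sketch in plain Python that builds it from an in-memory list and derives the values a client would need for pagination controls. The names `paginate`, `page_controls`, and `MAX_LIMIT` are hypothetical, and rejecting negative offsets is an assumption that the proposal does not spell out.

```python
from typing import Any, Dict, Sequence

MAX_LIMIT = 100  # assumed cap, matching the example value in this proposal


def paginate(items: Sequence[Any], offset: int = 0, limit: int = 50) -> Dict[str, Any]:
    """Build the proposed response envelope from an in-memory sequence."""
    if offset < 0 or limit < 1:
        # Assumption: invalid values are rejected rather than silently fixed.
        raise ValueError("offset must be >= 0 and limit must be >= 1")
    limit = min(limit, MAX_LIMIT)  # cap oversized requests instead of rejecting them
    return {
        "total": len(items),
        "limit": limit,
        "offset": offset,
        "data": list(items[offset:offset + limit]),
    }


def page_controls(page: Dict[str, Any]) -> Dict[str, Any]:
    """Derive what a client needs for previous/next controls from the metadata."""
    next_offset = page["offset"] + page["limit"]
    has_next = next_offset < page["total"]
    return {
        "has_previous": page["offset"] > 0,
        "has_next": has_next,
        "next_offset": next_offset if has_next else None,
    }


items = [{"id": i, "name": f"Item {i}"} for i in range(1, 1251)]
page = paginate(items, offset=100, limit=25)   # items 101 through 125
controls = page_controls(page)                 # next_offset is 125
```

Because `total`, `limit`, and `offset` are echoed back, the client can work out whether a next page exists without issuing an extra request.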
Describe alternatives you've considered
Cursor-based pagination was also considered. It uses an opaque cursor, typically derived from a unique, ordered field such as an item's ID or timestamp, to fetch the next set of results. This can be more performant for very large or frequently changing datasets, because the database does not have to scan past the skipped rows and pages stay stable when items are inserted or deleted between requests. Offset/limit pagination is simpler to implement, widely understood, and sufficient for our current needs. It is already a significant improvement over returning every record in one response.
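For comparison, the cursor approach might look roughly like the sketch below. It is illustrative only: `list_after` and `next_cursor` are hypothetical names, the raw item ID is used as the cursor for simplicity (a real API would usually encode it so that it stays opaque), and the items are assumed to be sorted by a unique, increasing `id`.

```python
from typing import Any, Dict, List, Optional


def list_after(items: List[Dict[str, Any]], after_id: Optional[int] = None,
               limit: int = 50) -> Dict[str, Any]:
    """Return up to `limit` items whose id is greater than `after_id`."""
    remaining = [it for it in items if after_id is None or it["id"] > after_id]
    page = remaining[:limit]
    has_more = len(remaining) > limit
    # The cursor is the id of the last item returned, or None on the final page.
    next_cursor = page[-1]["id"] if page and has_more else None
    return {"data": page, "next_cursor": next_cursor}


items = [{"id": i} for i in range(1, 11)]
first = list_after(items, limit=3)            # ids 1, 2, 3

items.pop(0)  # item 1 is deleted between the two requests

second = list_after(items, after_id=first["next_cursor"], limit=3)  # ids 4, 5, 6
offset_page = items[3:6]                      # offset=3 now yields ids 5, 6, 7
```

The last two lines show the trade-off: after a deletion, the offset-based second page silently skips item 4, while the cursor-based page does not. For data that changes rarely between requests this is usually an acceptable cost.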
Additional context
- Affected Endpoints: This change should be applied to all major list endpoints, including (but not limited to):
  - /api/v1/users
  - /api/v1/products
  - /api/v1/orders
- Implementation Strategy (Suggestion for FastAPI): We can create a reusable FastAPI dependency to handle the pagination logic. This dependency would parse the offset and limit query parameters, apply defaults and maximums, and be easily injected into any path operation function that needs pagination. This promotes DRY (Don't Repeat Yourself) principles.

  Example of a dependency:
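      # In a new `dependencies.py` file
      from typing import Dict

      from fastapi import Query

      MAX_LIMIT = 100


      async def pagination_params(
          offset: int = Query(0, ge=0),  # reject negative offsets
          limit: int = Query(50, ge=1),  # reject zero or negative limits
      ) -> Dict[str, int]:
          # Enforce a max limit by capping oversized values
          return {"offset": offset, "limit": min(limit, MAX_LIMIT)}

  Example usage in a route: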
      # In a router file
      from fastapi import APIRouter, Depends

      from dependencies import pagination_params  # adjust to the package layout

      router = APIRouter()


      @router.get("/items")
      async def list_items(pagination: dict = Depends(pagination_params)):
          # Use pagination["offset"] and pagination["limit"] in the database query
          ...
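What "use them in the database query" could look like is sketched below, using the standard-library `sqlite3` module as a stand-in because this issue does not name the database or ORM. The `items` table and the `fetch_page` helper are hypothetical. The same two-query shape (one `COUNT(*)`, one `LIMIT`/`OFFSET`) should carry over to most SQL backends.

```python
import sqlite3
from typing import Any, Dict


def fetch_page(conn: sqlite3.Connection, offset: int, limit: int) -> Dict[str, Any]:
    """Run a count query plus a LIMIT/OFFSET query and build the envelope."""
    total = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    rows = conn.execute(
        # A stable ORDER BY is needed so that pages do not overlap or skip rows.
        "SELECT id, name FROM items ORDER BY id LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()
    return {
        "total": total,
        "limit": limit,
        "offset": offset,
        "data": [{"id": row[0], "name": row[1]} for row in rows],
    }


# Hypothetical in-memory table, used only to exercise the helper.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"Item {i}") for i in range(1, 1251)],
)
page = fetch_page(conn, offset=100, limit=25)
```

In the route, `fetch_page(conn, pagination["offset"], pagination["limit"])` would be called with values that the dependency has already validated and capped.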