Add Proper Pagination Support with New /v2 API

Open godber opened this issue 6 months ago • 0 comments

Problem

The current API has incomplete pagination support. Responses lack the information clients need to page through large result sets:

No total count returned: API responses don't include the total number of records available, making it impossible to know how many pages exist
No metadata in responses: Clients receive only the raw data array without information about current page position, total results, or whether more results are available
Inconsistent pagination across endpoints: The /txt/* endpoints use slice() operations (api.ts:526, 539, 617), which don't respect the storage layer's pagination, while the JSON endpoints paginate in the storage layer

Current Implementation

The getSearchOptions() function (api_utils.ts:117-122) extracts from, size, and sort from query parameters:

export function getSearchOptions(req: TerasliceRequest, defaultSort = '_updated:desc') {
    const sort = req.query.sort || defaultSort;
    const size = parseQueryInt(req, 'size', 100);
    const from = parseQueryInt(req, 'from', 0);
    return { size, from, sort };
}

These parameters are passed to storage layer search() methods, but responses don't include pagination metadata.

Affected Endpoints

JSON endpoints:

GET /jobs (api.ts:279)
GET /ex (api.ts:459)
GET /jobs/:jobId/errors (api.ts:442-457)

TXT endpoints:

GET /txt/workers (api.ts:512)
GET /txt/nodes (api.ts:530)
GET /txt/jobs (api.ts:550)
GET /txt/ex (api.ts:575)
GET /txt/slicers / GET /txt/controllers (api.ts:601)

Proposed Solution

Create a new /v2 API with proper pagination support while maintaining /v1 for backwards compatibility.

Wrap responses in a pagination envelope for JSON endpoints:

  {
    "data": [...],
    "pagination": {
      "total": 1500,
      "from": 0,
      "size": 100,
      "has_more": true
    }
  }
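
The envelope above can be sketched as a small typed helper. This is only an illustration under stated assumptions: PaginationMeta, PaginatedResponse, and buildEnvelope are hypothetical names that do not exist in the codebase, and has_more is derived here by comparing the records returned so far against the total.

```typescript
// Hypothetical types and helper; none of these names exist in the codebase.
interface PaginationMeta {
    total: number;
    from: number;
    size: number;
    has_more: boolean;
}

interface PaginatedResponse<T> {
    data: T[];
    pagination: PaginationMeta;
}

function buildEnvelope<T>(
    data: T[], total: number, from: number, size: number
): PaginatedResponse<T> {
    return {
        data,
        pagination: {
            total,
            from,
            size,
            // More results exist when the records returned so far fall short of the total.
            has_more: from + data.length < total
        }
    };
}

// Five records in total, requested two at a time.
const firstPage = buildEnvelope(['a', 'b'], 5, 0, 2);
const lastPage = buildEnvelope(['e'], 5, 4, 2);
console.log(firstPage.pagination.has_more, lastPage.pagination.has_more);
```

Using data.length rather than size in the has_more check keeps the flag correct on a short final page.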

For txt endpoints: Add pagination info as footer/header text in table output
Add count queries: Modify storage layer to optionally return total counts alongside results (may need to use Elasticsearch's track_total_hits or separate count queries)
Consider adding:
- page parameter as an alternative to from (calculate offset as page * size)
- Response headers for pagination metadata (e.g., X-Total-Count, Link headers for next/prev pages)
- Standardized error response format
Keep /v1 and /txt routes unchanged for backwards compatibility (api.ts:504-506)
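
A minimal sketch of the optional pieces listed above, assuming a zero-based page parameter (so that page * size is the offset) and a /v2/jobs path chosen only for illustration. resolveOffset, paginationHeaders, and paginationFooter are hypothetical helpers, not existing Teraslice functions, and the footer wording is just one possible format.

```typescript
// Hypothetical helpers; names and signatures are assumptions, not Teraslice APIs.

// An explicit `from` wins; otherwise a zero-based `page` maps to page * size.
function resolveOffset(query: { from?: number; page?: number }, size: number): number {
    if (query.from !== undefined) return query.from;
    if (query.page !== undefined) return query.page * size;
    return 0;
}

// Build X-Total-Count plus a Link header (RFC 8288 syntax) for next/prev pages.
function paginationHeaders(
    basePath: string, total: number, from: number, size: number
): Record<string, string> {
    const headers: Record<string, string> = { 'X-Total-Count': String(total) };
    const links: string[] = [];
    if (from + size < total) {
        links.push(`<${basePath}?from=${from + size}&size=${size}>; rel="next"`);
    }
    if (from > 0) {
        links.push(`<${basePath}?from=${Math.max(from - size, 0)}&size=${size}>; rel="prev"`);
    }
    if (links.length > 0) headers['Link'] = links.join(', ');
    return headers;
}

// One-line footer for the plain-text table output.
function paginationFooter(total: number, from: number, count: number): string {
    if (count === 0) return `Showing 0 of ${total}`;
    return `Showing ${from + 1}-${from + count} of ${total}`;
}

const exampleOffset = resolveOffset({ page: 3 }, 100);
const exampleHeaders = paginationHeaders('/v2/jobs', 1500, exampleOffset, 100);
console.log(exampleOffset, exampleHeaders, paginationFooter(1500, exampleOffset, 100));
```

If page is meant to be one-based instead, the offset becomes (page - 1) * size; the issue does not say which convention is intended.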

Implementation Considerations

New v2 router: Create a separate v2routes Router alongside the existing v1routes (similar to the structure at api.ts:203-506)
Code reuse: Share business logic between v1 and v2 endpoints, only changing response formatting
Performance: Count queries can be expensive on large indices; consider caching or approximations for very large datasets
Elasticsearch specifics: The storage layer uses Elasticsearch (elasticsearch_store.ts), which returns result counts in hits.total (a plain number before Elasticsearch 7, an object with value and relation fields from 7 onward). This information should be passed through to the API response
Documentation: Clearly document the differences between the v1 and v2 APIs, with a migration guide
Deprecation path: Consider adding deprecation warnings to v1 responses (via headers or in response body) to encourage migration
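
The code reuse and Elasticsearch points above can be sketched together. This assumes the storage layer can hand back a count alongside the records; SearchResult, normalizeTotal, formatV1, and formatV2 are hypothetical names, and the only Elasticsearch behavior relied on is the shape of hits.total (a number before version 7, an object with value and relation from 7 onward).

```typescript
// hits.total is a number before Elasticsearch 7 and an object from 7 onward.
type HitsTotal = number | { value: number; relation: 'eq' | 'gte' };

// Hypothetical helper: reduce either shape to a count plus an exactness flag.
function normalizeTotal(total: HitsTotal): { total: number; exact: boolean } {
    if (typeof total === 'number') return { total, exact: true };
    // relation 'gte' means the count is a lower bound (the track_total_hits limit was reached).
    return { total: total.value, exact: total.relation === 'eq' };
}

// Hypothetical shared result type that both API versions format differently.
interface SearchResult<T> {
    records: T[];
    total: number;
    from: number;
    size: number;
}

// v1 keeps returning the bare array.
function formatV1<T>(result: SearchResult<T>): T[] {
    return result.records;
}

// v2 wraps the same result in the proposed envelope.
function formatV2<T>(result: SearchResult<T>) {
    const { records, total, from, size } = result;
    return {
        data: records,
        pagination: { total, from, size, has_more: from + records.length < total }
    };
}

const counted = normalizeTotal({ value: 10000, relation: 'gte' });
const sharedResult: SearchResult<string> = {
    records: ['a', 'b'], total: counted.total, from: 0, size: 2
};
console.log(formatV1(sharedResult), formatV2(sharedResult).pagination, counted.exact);
```

Keeping the exactness flag separate leaves room for the approximation strategy mentioned under Performance: a v2 response could report a lower-bound total without running a full count.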

Migration Path

Implement /v2 routes with new pagination format
Maintain /v1 routes with existing behavior
Update CLI and client libraries to support v2
Eventually deprecate v1 (with a sufficient notice period)
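
One way the final step could be signaled to clients is through response headers on v1 routes. The sketch below is an assumption-laden illustration: v1DeprecationHeaders is a hypothetical helper, the /v1 to /v2 path mapping and the sunset date are made up, and the choice of the Sunset header (RFC 8594) with a successor-version link relation is a suggestion rather than something the issue prescribes.

```typescript
// Hypothetical helper; the /v1 -> /v2 path mapping and the sunset date are assumptions.
function v1DeprecationHeaders(path: string, sunset: Date): Record<string, string> {
    return {
        // Sunset (RFC 8594) carries an HTTP-date after which the route may stop responding.
        'Sunset': sunset.toUTCString(),
        // Point clients at the equivalent v2 route.
        'Link': `<${path.replace(/^\/v1/, '/v2')}>; rel="successor-version"`
    };
}

// Placeholder date for illustration only.
const deprecation = v1DeprecationHeaders('/v1/jobs', new Date(Date.UTC(2030, 0, 1)));
console.log(deprecation);
```

Headers leave the v1 response body untouched, which fits the goal of keeping existing v1 behavior unchanged.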

References

api.ts:203-506 (v1 route setup)
api_utils.ts:117-122 (getSearchOptions)
api.ts:279-293 (GET /jobs endpoint)
api_utils.ts:51-81 (handleTerasliceRequest)
elasticsearch_store.ts:240-278 (search method)

Oct 01 '25 23:10 godber