
Add Proper Pagination Support with New /v2 API

Open godber opened this issue 6 months ago • 0 comments

Problem

The current API implementation has incomplete pagination support that doesn't provide sufficient information for clients to paginate through large result sets:

  1. No total count returned: API responses don't include the total number of records available, making it impossible to know how many pages exist
  2. No metadata in responses: Clients receive only the raw data array without information about current page position, total results, or whether more results are available
  3. Inconsistent pagination across endpoints: The /txt/* endpoints use slice() operations (api.ts:526, 539, 617) which don't respect the actual storage layer pagination, while JSON endpoints use storage pagination

Current Implementation

The getSearchOptions() function (api_utils.ts:117-122) extracts from, size, and sort from query parameters:

export function getSearchOptions(req: TerasliceRequest, defaultSort = '_updated:desc') {
    const sort = req.query.sort || defaultSort;
    const size = parseQueryInt(req, 'size', 100);
    const from = parseQueryInt(req, 'from', 0);
    return { size, from, sort };
}

These parameters are passed to storage layer search() methods, but responses don't include pagination metadata.
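
To illustrate why the missing metadata matters, here is a minimal sketch of what a client has to do today to read every record. `fetchPage` and `makeFakeEndpoint` are hypothetical stand-ins for a request such as GET /jobs?from=...&size=...; they are synchronous and backed by an in-memory array only to keep the example self-contained.

```typescript
// Hypothetical stand-in for a paginated request such as GET /jobs?from=0&size=100.
// A real client would make an HTTP call here; this one is synchronous for brevity.
type FetchPage<T> = (from: number, size: number) => T[];

interface FetchAllResult<T> {
    records: T[];
    requests: number;
}

// Without a total count, the only end-of-data signal is a page shorter than `size`.
function fetchAllWithoutTotal<T>(fetchPage: FetchPage<T>, size = 100): FetchAllResult<T> {
    if (size <= 0) throw new Error('size must be a positive integer');

    const records: T[] = [];
    let requests = 0;
    let from = 0;

    for (;;) {
        const page = fetchPage(from, size);
        requests += 1;
        records.push(...page);
        // A short (or empty) page means there is nothing left to read.
        if (page.length < size) break;
        from += size;
    }

    return { records, requests };
}

// In-memory "storage" used only for this demonstration.
function makeFakeEndpoint(total: number): FetchPage<number> {
    const all = Array.from({ length: total }, (_, i) => i);
    return (from, size) => all.slice(from, from + size);
}

const exactMultiple = fetchAllWithoutTotal(makeFakeEndpoint(200), 100);
console.log(exactMultiple.records.length, exactMultiple.requests);
```

When the number of records is an exact multiple of size, the client has to issue one extra request that returns an empty page before it can stop. It also cannot show progress or a page count at any point.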

Affected Endpoints

JSON endpoints:

  • GET /jobs (api.ts:279)
  • GET /ex (api.ts:459)
  • GET /jobs/:jobId/errors (api.ts:442-457)

TXT endpoints:

  • GET /txt/workers (api.ts:512)
  • GET /txt/nodes (api.ts:530)
  • GET /txt/jobs (api.ts:550)
  • GET /txt/ex (api.ts:575)
  • GET /txt/slicers / GET /txt/controllers (api.ts:601)

Proposed Solution

Create a new /v2 API with proper pagination support while maintaining /v1 for backwards compatibility.

  1. Wrap responses in a pagination envelope for JSON endpoints:
  {
    "data": [...],
    "pagination": {
      "total": 1500,
      "from": 0,
      "size": 100,
      "has_more": true
    }
  }
  2. For txt endpoints: Add pagination info as footer/header text in table output
  3. Add count queries: Modify storage layer to optionally return total counts alongside results (may need to use Elasticsearch's track_total_hits or separate count queries)
  4. Consider adding:
       • page parameter as an alternative to from (calculate offset as page * size, assuming zero-based page numbers)
       • Response headers for pagination metadata (e.g., X-Total-Count, Link headers for next/prev pages)
       • Standardized error response format
  5. Keep /v1 and /txt routes unchanged for backwards compatibility (api.ts:504-506)
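
The envelope and the optional extras listed above could be sketched roughly as follows. None of these helpers exist in the codebase; `buildEnvelope`, `pageToFrom`, `formatTxtFooter`, and `paginationHeaders` are hypothetical names, and the footer wording is only one possible format.

```typescript
interface Pagination {
    total: number;
    from: number;
    size: number;
    has_more: boolean;
}

interface PaginatedResponse<T> {
    data: T[];
    pagination: Pagination;
}

// Wrap one page of results in the proposed envelope.
// has_more is true while records remain beyond the end of this page.
function buildEnvelope<T>(data: T[], total: number, from: number, size: number): PaginatedResponse<T> {
    return {
        data,
        pagination: { total, from, size, has_more: from + data.length < total }
    };
}

// Assumes zero-based page numbers; one-based pages would need (page - 1) * size.
function pageToFrom(page: number, size: number): number {
    return Math.max(0, page) * size;
}

// Hypothetical footer line for the /txt table output.
function formatTxtFooter(p: Pagination, returned: number): string {
    if (returned === 0) return `Showing 0 of ${p.total}`;
    return `Showing ${p.from + 1}-${p.from + returned} of ${p.total}`;
}

// Header form of the same metadata. X-Total-Count is a widely used convention,
// not a registered standard header.
function paginationHeaders(p: Pagination): Record<string, string> {
    return { 'X-Total-Count': String(p.total) };
}

const firstPage = buildEnvelope(['job-1', 'job-2'], 1500, pageToFrom(0, 100), 100);
console.log(JSON.stringify(firstPage.pagination));
console.log(formatTxtFooter(firstPage.pagination, firstPage.data.length));
```

Computing has_more from from + data.length rather than from + size keeps the flag correct when the storage layer returns fewer records than requested.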

Implementation Considerations

  • New v2 router: Create a separate v2routes Router alongside the existing v1routes (similar to structure at api.ts:203-506)
  • Code reuse: Share business logic between v1 and v2 endpoints, only changing response formatting
  • Performance: Count queries can be expensive on large indices; consider caching or approximations for very large datasets
  • Elasticsearch specifics: The storage layer uses Elasticsearch (elasticsearch_store.ts), which returns result counts in hits.total (a plain number before Elasticsearch 7, an object with value and relation fields from 7.0 on); this information should be passed through
  • Documentation: Clearly document the differences between v1 and v2 APIs, with migration guide
  • Deprecation path: Consider adding deprecation warnings to v1 responses (via headers or in response body) to encourage migration
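
A rough sketch of how the count could be passed through while v1 and v2 share the same search result. The `SearchResult` shape and the `normalizeTotal`, `formatV1`, and `formatV2` helpers are assumptions for illustration; the actual return type of the search method in elasticsearch_store.ts is not shown in this issue. The `EsTotal` type covers both forms of hits.total: a number before Elasticsearch 7, and an object from 7.0 on, where relation is "gte" when the count is only a lower bound (by default, counting stops at 10,000 hits unless track_total_hits is set).

```typescript
// hits.total as returned by Elasticsearch: number (pre-7) or object (7.0+).
type EsTotal = number | { value: number; relation: 'eq' | 'gte' };

// Hypothetical result shape shared by the v1 and v2 handlers.
interface SearchResult<T> {
    results: T[];
    total: number;
    totalIsExact: boolean;
}

function normalizeTotal(total: EsTotal): { total: number; totalIsExact: boolean } {
    if (typeof total === 'number') return { total, totalIsExact: true };
    return { total: total.value, totalIsExact: total.relation === 'eq' };
}

// v1 keeps returning the bare array, so existing clients are unaffected.
function formatV1<T>(result: SearchResult<T>): T[] {
    return result.results;
}

// v2 wraps the same result in the pagination envelope.
function formatV2<T>(result: SearchResult<T>, from: number, size: number) {
    const seen = from + result.results.length;
    // With a lower-bound total, a full page is treated as a sign that more may exist.
    const hasMore = seen < result.total
        || (!result.totalIsExact && result.results.length === size);

    return {
        data: result.results,
        pagination: { total: result.total, from, size, has_more: hasMore }
    };
}

const shared: SearchResult<string> = {
    results: ['ex-1', 'ex-2'],
    ...normalizeTotal({ value: 2, relation: 'eq' })
};
console.log(JSON.stringify(formatV1(shared)));
console.log(JSON.stringify(formatV2(shared, 0, 100)));
```

Keeping the formatters separate from the search call means the business logic runs once per request and only the final serialization differs between the two API versions.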

Migration Path

  1. Implement /v2 routes with new pagination format
  2. Maintain /v1 routes with existing behavior
  3. Update CLI and client libraries to support v2
  4. Eventually deprecate v1 (with sufficient notice period)

References

  • api.ts:203-506 (v1 route setup)
  • api_utils.ts:117-122 (getSearchOptions)
  • api.ts:279-293 (GET /jobs endpoint)
  • api_utils.ts:51-81 (handleTerasliceRequest)
  • elasticsearch_store.ts:240-278 (search method)

godber • Oct 01 '25 23:10