Provide patterns and helpers for long-running tools (jobs, polling, callbacks)

Open dgenio opened this issue 3 months ago • 0 comments

Description

Summary

The current tooling model assumes that tools complete within a single request/response cycle. For long-running operations (e.g. batch processing, model training, large file processing), this is not always practical.

While progress notifications exist, they require a live connection for the entire duration, and there is no first-class support for job IDs, polling, or callbacks.

Proposal

Document recommended patterns for long-running work
- “Job” tools that:
  - accept parameters,
  - enqueue work,
  - return a job_id immediately.
- Companion tools/resources that:
  - report job status (get_job_status(job_id)),
  - retrieve results (get_job_result(job_id)).
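A minimal sketch of the job pattern described above, using only the Python standard library. The in-memory registry, the `submit_job` helper, and the `slow_square` workload are hypothetical names chosen for illustration; only `get_job_status` and `get_job_result` come from the proposal. How these functions would be registered as tools is left to the SDK and is not shown.

```python
import threading
import time
import uuid
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable

# Hypothetical in-memory job registry. A real deployment would likely use a
# persistent queue or database so that jobs survive a server restart.
_executor = ThreadPoolExecutor(max_workers=4)
_jobs: dict[str, dict[str, Any]] = {}
_lock = threading.Lock()


def _run(job_id: str, func: Callable[..., Any], kwargs: dict[str, Any]) -> None:
    """Worker wrapper: runs func and records its outcome in the job record."""
    with _lock:
        _jobs[job_id]["status"] = "running"
    try:
        result = func(**kwargs)
    except Exception as exc:
        with _lock:
            _jobs[job_id].update(status="failed", error=str(exc))
    else:
        with _lock:
            _jobs[job_id].update(status="succeeded", result=result)


def submit_job(func: Callable[..., Any], **kwargs: Any) -> str:
    """'Job' tool body: enqueue the work and return a job_id immediately."""
    job_id = uuid.uuid4().hex
    with _lock:
        _jobs[job_id] = {"status": "queued", "result": None, "error": None}
    _executor.submit(_run, job_id, func, kwargs)
    return job_id


def get_job_status(job_id: str) -> str:
    """Companion tool: report queued / running / succeeded / failed."""
    with _lock:
        return _jobs[job_id]["status"]


def get_job_result(job_id: str) -> Any:
    """Companion tool: return the result, or raise if it is not available."""
    with _lock:
        job = dict(_jobs[job_id])
    if job["status"] == "failed":
        raise RuntimeError(job["error"])
    if job["status"] != "succeeded":
        raise LookupError(f"job {job_id} is still {job['status']}")
    return job["result"]


def slow_square(n: int) -> int:
    """Stand-in for a long-running operation."""
    time.sleep(0.1)
    return n * n
```

A client would call `submit_job(slow_square, n=7)`, keep the returned ID, and poll `get_job_status` until it reports `succeeded` or `failed` before calling `get_job_result`. Because the ID is the only state the client holds, a dropped connection does not lose the job.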
Consider small helpers
- Optional helper utilities for implementing job queues with common patterns (e.g., in-memory + pluggable backend).
- Clear separation between the SDK's responsibilities and the user's job infrastructure.
Integrate with progress notifications where applicable
- Clarify how progress notifications interact with job-style tools vs single-call tools.
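One possible way to combine the two mechanisms, sketched under assumptions: the job persists its progress in a record that a status tool can return, and additionally forwards each update as a live notification when a client is still connected. `JobRecord`, `make_reporter`, `process_items`, and the `(fraction, message)` callback shape are hypothetical and do not reflect any SDK API; the `live` callback stands in for whatever the SDK uses to send a progress notification.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Assumed callback shape: (fraction done in [0, 1], human-readable message).
ProgressSink = Callable[[float, str], None]


@dataclass
class JobRecord:
    """Hypothetical per-job state that a status tool could return."""
    progress: float = 0.0
    message: str = ""
    history: list = field(default_factory=list)


def make_reporter(record: JobRecord, live: Optional[ProgressSink] = None) -> ProgressSink:
    """Build a callback that always persists progress and, when a live sink
    is supplied, also forwards the update to it."""
    def report(fraction: float, message: str) -> None:
        # Persist first, so polling clients see progress regardless of
        # whether the live notification can be delivered.
        record.progress = fraction
        record.message = message
        record.history.append((fraction, message))
        if live is not None:
            try:
                live(fraction, message)
            except ConnectionError:
                # The client went away; the record is still up to date.
                pass
    return report


def process_items(items: list, report: ProgressSink) -> int:
    """Example workload that reports progress after each item."""
    total = 0
    for i, item in enumerate(items, start=1):
        total += item
        report(i / len(items), f"processed {i}/{len(items)}")
    return total
```

In this sketch a single-call tool would pass only a live sink and discard the record, while a job-style tool would keep the record and treat the live sink as optional. The workload itself does not need to know which mode it is running in.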

Why this matters

User experience: Clients and LLM agents can show feedback and continue other work instead of blocking on long-running operations.
Reliability: Network glitches or timeouts do not require restarting long jobs from scratch.
Flexibility: Encourages patterns that map cleanly onto external schedulers/queues.

Acceptance criteria

- [ ] Documentation includes a “long-running tools” section with job/polling patterns.
- [ ] At least one example demonstrates a job-style tool pattern using the SDK.
- [ ] It is clear how progress notifications can be combined with these patterns.

Nov 28 '25 15:11 dgenio