Restate CLI
To give our users a better experience, we believe a CLI is required. The CLI should let users interact effectively with a Restate deployment. This includes, among other things:
- Obtaining cluster information (ingress, REST endpoints)
- Interacting with service endpoints
  - Registering new endpoints
  - Inspecting endpoints
- Interacting with services
  - Inspecting services
  - Obtaining versioning information
- Interacting with invocations
  - Inspecting invocations and service state
  - Repairing state
  - Detecting stuck invocations
  - Killing/cancelling invocations
Ideally, we can also evolve the Restate CLI into the CLI used by our managed service.
### Tasks
- [ ] https://github.com/restatedev/restate/issues/889
- [ ] https://github.com/restatedev/restate/issues/890
- [ ] https://github.com/restatedev/restate/issues/924
- [ ] https://github.com/restatedev/restate/issues/938
- [ ] [Draft] Restate doctor: detect and report unhealthy invocations and/or service instances
- [ ] [Draft] Restate inspect: Grab details about any object given its ID or name
- [ ] https://github.com/restatedev/restate/issues/945
- [ ] https://github.com/restatedev/restate/issues/947
- [ ] https://github.com/restatedev/restate/issues/948
- [ ] https://github.com/restatedev/restate/issues/1076
- [ ] https://github.com/restatedev/restate/issues/926
- [ ] https://github.com/restatedev/restate/issues/892
- [ ] https://github.com/restatedev/restate/issues/997
- [ ] https://github.com/restatedev/restate/issues/1075
For the prototype we had built a CLI tool, https://github.com/restatedev/cli, which did a number of things. We could also take some ideas from there (not necessarily code).
The CLI can be a one-stop shop for developers and operators using Restate. I imagine a CLI that guides users through the getting-started journey for any new project and is a close companion during development and in production.
@AhmedSoliman feel free to add the next steps for how to extend the current CLI to this umbrella issue (e.g. template project creation, introspection of state and journal, etc.).
Latest update on the plan:
- Ship the basic "catalog" experience of the CLI before EOY (tier 1 features).
  - The catalog experience includes discovery, list/describe for services, basic status info for services (keyed and unkeyed) and likewise for endpoints, and finding invocations that are not making progress and killing/cancelling them.
  - The goal is to give users the basic tools they need to get through some of the potentially frustrating parts of their day 0-1 experiments: why is my invocation stuck, what's happening with a service, how busy is a service key, etc.
  - We will not tackle state/data introspection or patching in this iteration.
- Tentatively add support for "invoke" through the CLI, time permitting.
- Tier 2 features that might not make the cut line in this iteration are:
  - Introspection of schema types and methods
  - State/data introspection and patching
  - Restate doctor: runs a set of internal checks and runbooks to detect common gotchas and report them to the user (deadlocks, endpoint health, etc.)
  - Restate inspect: takes any ID, service name, or gRPC message type and prints information about it. This requires a uniform internal ID schema (possibly the one proposed here).
- Before EOY, we need support in the form of new or improved DataFusion/SQL tables to back the CLI. We accept that the CLI will absorb the complexity of stitching together information from various sources and queries in order to visualize it.
- We agreed that the path for the future is a dedicated gRPC service, the "Cluster Admin Server", that provides high-fidelity APIs supporting CLI operations and also acts as the SQL query server for the cluster. The SQL interface will be provided through the gRPC service as well. That said, this will not happen before EOY and the CLI will not wait for it, but its availability in the future is expected to dramatically improve the CLI experience by providing richer information and reducing CLI complexity.
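
To make the "stitching" point a bit more concrete, here is a rough sketch of what the CLI side could look like when it combines the results of two separate queries to report invocations that are not making progress. Everything in it is an assumption for illustration only: the row shapes (`InvocationRow`, `ServiceRow`), the field names, the status strings, and the 10-minute threshold are hypothetical and do not reflect the actual DataFusion/SQL tables or any existing Restate API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class InvocationRow:
    # Hypothetical row shape, as if returned by an invocation status query.
    invocation_id: str
    service: str
    status: str
    last_progress_at: datetime


@dataclass
class ServiceRow:
    # Hypothetical row shape, as if returned by a service/endpoint query.
    service: str
    endpoint: str


def find_stuck(invocations, services, now, threshold):
    """Join the two result sets client-side and keep the invocations that
    are not completed and have been idle for at least `threshold`."""
    endpoint_by_service = {s.service: s.endpoint for s in services}
    stuck = []
    for inv in invocations:
        if inv.status == "completed":
            continue
        idle = now - inv.last_progress_at
        if idle >= threshold:
            stuck.append({
                "invocation_id": inv.invocation_id,
                "service": inv.service,
                "endpoint": endpoint_by_service.get(inv.service, "<unknown>"),
                "idle": idle,
            })
    # Longest-idle invocations first, since those are the most suspicious.
    stuck.sort(key=lambda row: row["idle"], reverse=True)
    return stuck


# Made-up sample data standing in for two query results.
now = datetime(2023, 12, 1, 12, 0, tzinfo=timezone.utc)
invocations = [
    InvocationRow("inv-1", "Greeter", "running", now - timedelta(seconds=30)),
    InvocationRow("inv-2", "Cart", "running", now - timedelta(minutes=20)),
    InvocationRow("inv-3", "Cart", "completed", now - timedelta(hours=2)),
    InvocationRow("inv-4", "Greeter", "running", now - timedelta(minutes=45)),
]
services = [
    ServiceRow("Greeter", "http://localhost:9080"),
    ServiceRow("Cart", "http://localhost:9081"),
]

report = find_stuck(invocations, services, now, timedelta(minutes=10))
for row in report:
    print(f'{row["invocation_id"]}  {row["service"]}  {row["endpoint"]}  idle={row["idle"]}')
```

The join and the idle-time filter live in the client here, which is exactly the complexity we accept for now. A richer server-side API, such as the planned Cluster Admin Server, could later return this information directly.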
Can we close this @tillrohrmann @AhmedSoliman and just rely on the single issues?
I am fine with closing this issue and moving the open issues into a follow-up umbrella issue.