graph-node icon indicating copy to clipboard operation
graph-node copied to clipboard

graphman: create GraphQL API to execute commands

Open isum opened this issue 1 year ago • 2 comments

This PR creates a new graphman GraphQL API with initial support for deployment info, pause, resume and restart commands. It also provides the building blocks needed to support more complex graphman commands.

Part of #3729

Note

The GraphQL API will be extended incrementally, so there will be a transition period when graphman CLI modules will have weird type conversions and potentially duplicate code. That will go away when all commands are extracted into the core crate.

Todos

  • [x] Integrate extracted commands back into graphman CLI
  • [x] Add tests to the API using a real database connection
  • [x] Add documentation for the new graphman API
  • [x] Add support for long running graphman commands
  • [x] Fix metrics initialization error
  • [x] ~Extend the API with the remaining commands~ Will be in the scope of another PR
  • [x] Add documentation on how to extend the GraphQL API with new commands

isum avatar Jul 18 '24 15:07 isum

I noticed a few things in the GraphQL schema (which is well documented):

  • the schema has a QueryRoot and MutationRoot. IIRC, at least for queries, that must be called Query
  • for info to filter by namespace, you have to say info(deployment: { namespace: { namespace: "sgd86"}} ), i.e., mention namespace twice. Couldn't that just be info(deployment: { namespace: "sgd86"}) ? Similarly, when looking for a subgraph by name, that could be shortened a bit, too.
  • very minor, but I don't think the dpeloyment info should expose the id, that's an implementation detail.

lutter avatar Aug 21 '24 23:08 lutter

For the general approach how to expose the same business logic in the CLI and API, if you have a function biz that implements the business logic (I'll write as if this was a thunk, but it should probably be a trait), you want to do one of two things:

Run biz in the CLI

  • parameters come from the command line (clap) and possibly some general setup like the store
  • the function biz() should run synchronously, even if the command might take a long time
  • output should go straight to the terminal (ideally, there's access to stdout and stderr, though I don't think we use that today)
  • success/failure is indicated by the exit status
  • hitting ^C should immediately abort the command
  • being conscious about how much users have to type to run a command is a consideration

Run biz in the API

  • parameters come from the GraphQL query and might use existing setup like the store
  • the function biz() should run as part of a larger setup where a long-running biz() runs in the background, i.e., not synchronously with the GraphQL request.
  • output should be logged, though what should be logged might be different from what the CLI prints on the terminal
  • success/failure of the request just indicates whether the task to run biz() has been accepted

For the API case, it seems to me that it would be enough to have something that wraps biz() in the additional logic that background execution requires, something like

fn run(biz: Biz, ctx: Context) -> Result <_,_> {
  // The main thing that cmd_ctx allows is to change the status (or progress) of the command
  // for very long-running things
  // This also glosses over preventing duplicate execution, and how to tell generically whether two commands are the same
  // but that would happen here, too
  let cmd_ctx = add_row_to_graphman_command_executions(..)
  let res = biz.execute(ctx, cmd_ctx);
  cmd_ctx.command_finished();
  res
}

This might need to be split up a bit to make the right pieces async, but it could be fairly linear code with little need for extension points.

lutter avatar Aug 22 '24 00:08 lutter