[AGE-275] [Evaluations] cancel / terminate evaluation

Open • aybruhm opened this issue 1 year ago • 1 comment

Is your feature request related to a problem? Please describe.

We want to be able to cancel a currently running evaluation job by specifying the evaluation ID and job ID.

Describe the solution you'd like

  • Update the create evaluation endpoint to return the evaluation ID and the unique job ID
  • Implement the necessary business logic to stop a running job in the evaluation_service
  • Implement a POST endpoint that calls the business logic to stop the running job
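The three steps above can be sketched with an in-memory stand-in for the database. This is a minimal sketch, not Agenta's actual code: the names `Evaluation`, `EVALUATIONS`, `create_evaluation`, `stop_evaluation_job`, and the `EVALUATION_STARTED` status string are hypothetical, and the queue-level revoke is left as a comment.

```python
import uuid
from dataclasses import dataclass


@dataclass
class Evaluation:
    evaluation_id: str
    job_id: str
    status: str = "EVALUATION_STARTED"  # hypothetical status name


# In-memory stand-in for the Evaluation database (assumption for this sketch).
EVALUATIONS = {}


def create_evaluation() -> dict:
    """Create an evaluation and return both IDs so the caller can cancel it later."""
    evaluation = Evaluation(evaluation_id=uuid.uuid4().hex, job_id=uuid.uuid4().hex)
    EVALUATIONS[evaluation.evaluation_id] = evaluation
    return {"evaluation_id": evaluation.evaluation_id, "job_id": evaluation.job_id}


def stop_evaluation_job(evaluation_id: str, job_id: str) -> dict:
    """Business logic that a POST handler could call to stop a running job."""
    evaluation = EVALUATIONS.get(evaluation_id)
    if evaluation is None:
        raise KeyError(f"unknown evaluation: {evaluation_id}")
    if evaluation.job_id != job_id:
        raise ValueError("job_id does not belong to this evaluation")
    # A real service would also revoke the job in the task queue at this point.
    evaluation.status = "EVALUATION_CANCELLED"
    return {"evaluation_id": evaluation_id, "status": evaluation.status}
```

Requiring both IDs lets the service reject a stale or mismatched job ID instead of cancelling the wrong job.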

Additional context

  • What is the time frame for allowing a user to revoke a running job?
  • What should be done if a user revokes a running job that has created resources already?

-> It seems like cancel and terminate are semantically different (?). If we decide to implement this feature, let's just pick one term.

-> performance improvement

Motivations for having cancel

  • The job is taking too long for some reason and I, as a user, don't know why and don't care why. I just want to cancel it.
  • Something failed in the evaluation backend and the evaluation frontend will never know. I, as a user, just want to sync/reset the frontend, to avoid seeing an infinite timer.
  • I, as a user, launched an evaluation but I realized that I made a mistake, or missed something, and I want to cancel it to make changes first.

Downsides of not having cancel

  • As a user, I find infinitely running evaluations unstoppable, ugly, and stressful.
  • As a user, I feel that compute is being wasted on unwanted evaluations.
  • As Agenta, we actually waste compute on unwanted evaluations in the cloud.

Considerations for cancel

  • Keeping the router, service, database, queue, and workers in sync.
  • Handling side-effects of running an evaluation, gracefully.
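One way to handle side-effects gracefully is cooperative cancellation: the worker checks a cancel flag between rows and cleans up before it exits. The sketch below assumes a hypothetical `run_evaluation` worker function and an `is_cancelled` callback. It discards partial results on cancel, which is only one possible answer to the open question about resources that have already been created.

```python
from typing import Callable, Iterable, List


def run_evaluation(rows: Iterable[str], is_cancelled: Callable[[], bool]) -> dict:
    """Process rows one at a time, checking for cancellation before each row."""
    results: List[str] = []
    for row in rows:
        if is_cancelled():
            # Clean up side-effects: this sketch drops the partial results.
            results.clear()
            return {"status": "EVALUATION_CANCELLED", "results": results}
        results.append(row.upper())  # placeholder for the real evaluator call
    return {"status": "EVALUATION_FINISHED", "results": results}
```

In a real deployment `is_cancelled` would read the evaluation status from the database or the queue, so the router, service, and workers all see the same state.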

Elements required for cancel

  • A new EVALUATION_CANCELLED status
  • A handle to the JOB_ID in the Evaluation database.
  • Update the evaluation_router to allow for a cancel command.
  • Update the evaluation_service to handle cancelling jobs.
  • Update the frontend to handle cancelled evaluations.
  • (Eventually, for audit logging) Add Jobs to the database.
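The new status and the cancel transition could look roughly like this. Only `EVALUATION_CANCELLED` comes from this issue; the other status names and the helper `next_status_on_cancel` are assumptions made for illustration.

```python
from enum import Enum


class EvaluationStatus(str, Enum):
    EVALUATION_STARTED = "EVALUATION_STARTED"      # assumed existing status
    EVALUATION_FINISHED = "EVALUATION_FINISHED"    # assumed existing status
    EVALUATION_FAILED = "EVALUATION_FAILED"        # assumed existing status
    EVALUATION_CANCELLED = "EVALUATION_CANCELLED"  # new status proposed in this issue


# Statuses after which a cancel command should have no effect.
TERMINAL_STATUSES = {
    EvaluationStatus.EVALUATION_FINISHED,
    EvaluationStatus.EVALUATION_FAILED,
    EvaluationStatus.EVALUATION_CANCELLED,
}


def next_status_on_cancel(current: EvaluationStatus) -> EvaluationStatus:
    """Return the status after a cancel command; terminal statuses are unchanged."""
    if current in TERMINAL_STATUSES:
        return current
    return EvaluationStatus.EVALUATION_CANCELLED
```

Treating a cancel of a finished or already cancelled evaluation as a no-op keeps the command idempotent, and the frontend can stop its timer on any terminal status.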

From SyncLinear.com | AGE-275

aybruhm • Jan 22 '24 15:01

Closing this issue as completed will resolve the first issue addressed here.

aybruhm • Jan 22 '24 15:01