[RFC] Pull-based jobs dispatch for integration with custom compute backends
Why is this needed?
Currently, the Digger orchestrator can only dispatch jobs to the CI backend (e.g. GitHub Actions) in a "push" fashion. With Actions, it calls the .../{workflow_id}/dispatches endpoint directly. This has the following drawbacks:
- Integration with a different compute backend requires implementing a new CIService in the backend codebase. There is currently no way to integrate a "generic" CI backend.
- Queuing and concurrency management logic has to live on the Digger side. This is solved to some extent by max_concurrency, but that's not sufficient for complex setups which may require their own job scheduling logic.
Proposed solution: expose a queuing API
The Digger orchestrator already uses Postgres as a queue for managing concurrency when the max_concurrency option is set. This way we take advantage of Postgres's reliability without introducing a separate moving part like Redis or Kafka. Given that Digger is most often used in a self-hosted scenario, we are unlikely to ever hit the performance / scalability limits of Postgres.
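For context, below is a minimal sketch of the kind of Postgres-backed dequeue this proposal builds on. It is an illustration, not Digger's actual schema: the digger_jobs table, its columns, and the status values are all hypothetical. The key idea is that SELECT ... FOR UPDATE SKIP LOCKED lets many workers pull from the same table concurrently without blocking each other or double-claiming a job.

```go
package queue

import (
	"context"
	"database/sql"
)

// dequeueJob atomically claims the oldest pending job for a topic.
// FOR UPDATE SKIP LOCKED skips rows already locked by concurrent
// workers, so two workers can never claim the same job.
func dequeueJob(ctx context.Context, db *sql.DB, topic string) (string, error) {
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return "", err
	}
	defer tx.Rollback() // no-op if Commit succeeds

	var id int64
	var spec string
	err = tx.QueryRowContext(ctx, `
		SELECT id, job_spec FROM digger_jobs
		WHERE topic = $1 AND status = 'pending'
		ORDER BY created_at
		LIMIT 1
		FOR UPDATE SKIP LOCKED`, topic).Scan(&id, &spec)
	if err != nil {
		return "", err // sql.ErrNoRows means the queue is empty
	}
	if _, err := tx.ExecContext(ctx,
		`UPDATE digger_jobs SET status = 'running' WHERE id = $1`, id); err != nil {
		return "", err
	}
	return spec, tx.Commit()
}
```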
The API would be similar to Google's Pub/Sub API.
Create a subscription

POST /subscriptions/
body:
{
    "topic": "jobs_myapp_prod"
}
returns 200:
{
    "id": "mySubId"
}
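A worker in a custom executor pool would create its subscription once at startup. A hedged sketch of the client side, assuming the endpoint shape proposed above (nothing here is implemented yet, and the host URL is illustrative):

```go
package worker

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// createSubscription registers a subscription for a topic and returns
// its id. Endpoint path and field names follow the proposal above.
func createSubscription(host, topic string) (string, error) {
	body, err := json.Marshal(map[string]string{"topic": topic})
	if err != nil {
		return "", err
	}
	resp, err := http.Post(host+"/subscriptions/", "application/json", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status: %s", resp.Status)
	}
	var out struct {
		ID string `json:"id"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	return out.ID, nil
}
```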
Pull 1 job from a subscription

PATCH /subscriptions/{sub_id}
<empty body>
returns 200:
{
    "jobSpec": <Digger Job Spec JSON to be passed to the CLI>
}
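The consuming side would then be a simple poll loop. A sketch under the same assumptions: endpoint shape as proposed above, a caller-supplied runJob that hands the spec to the Digger CLI, and (an assumption, since the proposal doesn't specify the empty-queue case) a response with a missing jobSpec treated as "no job available":

```go
package worker

import (
	"context"
	"encoding/json"
	"net/http"
	"time"
)

// pollJobs repeatedly pulls one job at a time from a subscription and
// hands each job spec to runJob. The no-job backoff behaviour is an
// assumption, not part of the proposal.
func pollJobs(ctx context.Context, host, subID string, runJob func(spec json.RawMessage)) error {
	client := &http.Client{Timeout: 30 * time.Second}
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		default:
		}
		req, err := http.NewRequestWithContext(ctx, http.MethodPatch,
			host+"/subscriptions/"+subID, nil)
		if err != nil {
			return err
		}
		resp, err := client.Do(req)
		if err != nil {
			return err
		}
		var out struct {
			JobSpec json.RawMessage `json:"jobSpec"`
		}
		err = json.NewDecoder(resp.Body).Decode(&out)
		resp.Body.Close()
		if err != nil || len(out.JobSpec) == 0 {
			time.Sleep(5 * time.Second) // back off while the queue is empty
			continue
		}
		runJob(out.JobSpec)
	}
}
```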
Why not simply /jobs/pull?
- We need some way of narrowing down the scope of a subscription. At the very least, for prod / non-prod separation of executor pools, which might run on different infra, we need a way to differentiate "kinds" of jobs. Hence the concept of topics.
- Now that we have topics, "pulling" is actually modifying some kind of resource. But not the topic! The topic stays the same; rather, it modifies a "pool" of jobs relevant to the topic. Hence the concept of a subscription (same as in Google's Pub/Sub). One possible data model is sketched below.
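To make the topic/subscription distinction concrete, here is one possible data model. All names are hypothetical and not part of the proposal:

```go
package model

// Topic is a named pool of jobs, e.g. "jobs_myapp_prod". Jobs are
// published to a topic; the topic itself is never mutated by pulling.
type Topic struct {
	ID   int64
	Name string
}

// Subscription is the mutable resource that PATCH operates on: it
// represents a consumer's view of the topic's job pool.
type Subscription struct {
	ID      string
	TopicID int64
}

// Job carries the Digger Job Spec JSON handed to the CLI. Pulling via
// PATCH /subscriptions/{sub_id} claims the next pending Job for that
// subscription's topic and records which subscription claimed it.
type Job struct {
	ID             int64
	TopicID        int64
	Status         string  // e.g. "pending" | "running" | "done"
	JobSpec        []byte  // Digger Job Spec JSON
	SubscriptionID *string // set once claimed; nil while pending
}
```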