
Support 1 million concurrent runs

Open · johnstairs opened this issue 1 year ago · 1 comment

### Tasks
- [ ] #29
- [ ] Allow tagging runs
- [ ] Allow listing runs by status, buffers, and tags
- [ ] https://github.com/microsoft/tyger/issues/113
- [ ] Improve run sweeper scalability
- [ ] Allow buffers to be (bulk) deleted
- [ ] https://github.com/microsoft/tyger/issues/114
- [ ] Display buffer status in `tyger buffer show` output
- [ ] Record the time at which a run starts running
- [ ] Expose an endpoint to show counts of runs by status (filtered by tags)
- [ ] Handle deadlocks that can occur with dependent runs
- [ ] Use [gang scheduling](https://kubedl.io/docs/training/gangscheduling/) for distributed runs to avoid deadlocks
- [ ] Set a limit on the number of Kubernetes jobs

johnstairs · Jun 06 '24 19:06

@hansenms @naegelejd: Using this to track the work we need to do.

johnstairs · Jun 06 '24 19:06