TESK
Implement resource limits
The task is to implement quotas for groups/users as described in #9: three separate limits, expressed in `core*hours`, `memGB*hours` and `storageGB*hours`. The quotas reset/renew themselves monthly. Exceeding any of the quotas stops new task submissions from the user/group members. We can later consider killing currently running jobs.

Tasks:
- [x] - Calculating used resources monthly: collect all jobs for a user/group that ran in a given month and use the resource requests (cpu/mem) and storage size to obtain the metrics. Planned metrics: cpu/mem = SUM over jobs(SUM over executors(executor time in month * resource request)); storage = SUM over jobs(PVC size * SUM over executors(executor time in month)).
- [x] - Exposing an API that presents the usage statistics
- [ ] - Introducing obligatory default values for cpu/mem requests
- [ ] - ConfigMaps with limits for groups, and code to read them
- [ ] - Enforcing the limits at task submission
- [ ] - Caching of user/group jobs and cache eviction
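
To make the planned metrics and the submission check concrete, here is a minimal Python sketch. It is only an illustration of the formulas above, not TESK's actual code: the names `Executor`, `Task`, `Usage`, `monthly_usage` and `may_submit` are hypothetical, and it assumes that executor time is clipped to the month boundaries and that a submission is rejected only when usage is strictly above a limit.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Executor:
    # Hypothetical record of one executor run and its resource requests.
    start: datetime
    end: datetime
    cpu_cores: float   # CPU request, in cores
    memory_gb: float   # memory request, in GB


@dataclass
class Task:
    # Hypothetical record of one task: its PVC size and its executors.
    pvc_size_gb: float
    executors: List[Executor]


@dataclass
class Usage:
    # Doubles as the quota definition: same three dimensions.
    core_hours: float = 0.0
    mem_gb_hours: float = 0.0
    storage_gb_hours: float = 0.0


def hours_in_month(ex: Executor, month_start: datetime, month_end: datetime) -> float:
    """Hours the executor ran inside [month_start, month_end)."""
    start = max(ex.start, month_start)
    end = min(ex.end, month_end)
    return max((end - start).total_seconds(), 0.0) / 3600.0


def monthly_usage(tasks: List[Task], month_start: datetime, month_end: datetime) -> Usage:
    """Sum the three metrics over all tasks of one user/group."""
    usage = Usage()
    for task in tasks:
        task_hours = 0.0
        for ex in task.executors:
            hours = hours_in_month(ex, month_start, month_end)
            usage.core_hours += hours * ex.cpu_cores
            usage.mem_gb_hours += hours * ex.memory_gb
            task_hours += hours
        # Storage: PVC size times the summed executor time of the task.
        usage.storage_gb_hours += task.pvc_size_gb * task_hours
    return usage


def may_submit(usage: Usage, limits: Usage) -> bool:
    """Assumption: new submissions stop once any quota is exceeded."""
    return (
        usage.core_hours <= limits.core_hours
        and usage.mem_gb_hours <= limits.mem_gb_hours
        and usage.storage_gb_hours <= limits.storage_gb_hours
    )
```

In this sketch the limits would come from the per-group configuration, and `may_submit` would be called with the current month's usage before a new task is accepted. An executor that started in the previous month only contributes the hours that fall inside the current one.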