pytorch-operator
PyTorchJob 1.0
Requirements
Configuration and deployment
| Description | Category | Status | Issue |
|---|---|---|---|
| Kustomize package | Required | Done | |
| Application CR | Required | Done | |
| Images listed in kustomization.yaml | Required | Done | |
| Upgradeability | Required | Done | |
| Separate cluster scoped and namespace scoped resources | Recommended | Not Done | #215 |
| Kustomize package should be deployable on its own | Recommended | Done | |
Custom Resources
| Description | Category | Status | Issue |
|---|---|---|---|
| Version stability | Required | Done | |
| Backward compatibility | Required | Done | |
| Supports status subresource | Required | Done | |
| CRD schema validation | Required | Not Done | https://github.com/kubeflow/pytorch-operator/issues/183 |
| Training operators follow kubeflow/common conventions | Required | Done | |
Logging and monitoring
| Description | Category | Status | Issue |
|---|---|---|---|
| Liveness/Readiness signals | Required | Done | |
| Prometheus metrics | Required | Done | |
| JSON logging | Recommended | Done | |
CI/CD
| Description | Category | Status | Issue |
|---|---|---|---|
| E2E tests | Required | Done | |
| Scalability / load testing | Required | Done | |
| Continuous building of docker images | Recommended | Done | |
| Continuous updating of Kustomize manifests | Recommended | Done | |
Docs
| Description | Category | Status | Issue |
|---|---|---|---|
| API Reference docs | Required | Done | |
| Application docs | Required | Done | |
Owners/Maintenance
| Description | Category | Explanation | Status | Issue |
|---|---|---|---|---|
| Healthy number of committers and commits | Required | Committers are listed as approvers in OWNERS files. Number to be determined by TOC based on size and scope of application. | Done | |
| At least 2 different organizations are committers | Required | Google, CaiCloud, Cisco | Done | |
Adoption
| Description | Category | Explanation |
|---|---|---|
| List of users running the application | Recommended | Suggest listing adopters willing to be identified publicly in ADOPTERS.md |
Maybe we should also open an issue for "Separate cluster scoped and namespace scoped resources".
/kind feature
/area engprod
/priority p2