cortex
Production infrastructure for machine learning at scale
#### Description
If the `max_replicas` field is not set but the `min_replicas` field is, then default `max_replicas` to the value of `min_replicas`.
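The defaulting rule described here could look roughly like the following. This is a minimal sketch, not Cortex's actual implementation: the function name `default_max_replicas` and the plain-dictionary representation of the autoscaling settings are assumptions made for illustration.

```python
from typing import Any, Dict


def default_max_replicas(autoscaling: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the settings with max_replicas defaulted.

    Hypothetical helper: if max_replicas is unset and min_replicas is set,
    max_replicas takes the value of min_replicas. Anything else is left alone.
    """
    result = dict(autoscaling)  # copy so the caller's dict is not mutated
    if result.get("max_replicas") is None and result.get("min_replicas") is not None:
        result["max_replicas"] = result["min_replicas"]
    return result


if __name__ == "__main__":
    print(default_max_replicas({"min_replicas": 3}))
    print(default_max_replicas({"min_replicas": 2, "max_replicas": 5}))
```

An explicitly set `max_replicas` is never overwritten, and a spec that sets neither field is returned unchanged.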
#### Description
At bigger scales, the cluster runs out of IPs for its nodes/pods.
#### Solutions
1. Use "Custom CNI networking", which basically adds another CIDR block to the VPC...
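To get a feel for how much headroom an additional CIDR block adds, the arithmetic can be sketched with Python's standard `ipaddress` module. The CIDR ranges below are illustrative assumptions, not values taken from a Cortex cluster, and the count is the raw address-space size (it ignores addresses reserved per subnet).

```python
import ipaddress
from typing import Iterable


def total_addresses(cidrs: Iterable[str]) -> int:
    """Sum the raw number of addresses across a list of CIDR blocks."""
    return sum(ipaddress.ip_network(cidr).num_addresses for cidr in cidrs)


if __name__ == "__main__":
    primary_only = ["10.0.0.0/16"]                        # hypothetical primary VPC CIDR
    with_secondary = ["10.0.0.0/16", "100.64.0.0/16"]     # hypothetical secondary block added
    print(total_addresses(primary_only))
    print(total_addresses(with_secondary))
```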
#### Description
The user may want to use a regex instead of listing all of the node groups to use in the API spec:
```yaml
# api spec
# ......
```
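One way the regex-based selection could behave is sketched below. This is an assumption-laden illustration, not the proposed Cortex feature itself: the function name `select_node_groups`, the node group names, and the choice of full-match semantics (a pattern must match the entire name) are all hypothetical.

```python
import re
from typing import List


def select_node_groups(patterns: List[str], available: List[str]) -> List[str]:
    """Return the available node groups whose full name matches any pattern.

    Order follows `available`; each node group appears at most once.
    """
    compiled = [re.compile(p) for p in patterns]
    return [
        name
        for name in available
        if any(rx.fullmatch(name) for rx in compiled)
    ]


if __name__ == "__main__":
    groups = ["cpu-spot", "cpu-on-demand", "gpu-large"]  # hypothetical node group names
    print(select_node_groups([r"cpu-.*"], groups))
```

Full-match semantics keep a pattern such as `cpu` from accidentally selecting every node group that merely contains that substring; a search-based match would be the looser alternative.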
#### Description
The `cluster-autoscaler` can use a lot of memory when there are many nodes to keep track of and/or many nodes that need to be added. The...
#### Description
Find a way to restrict the number of discovered targets per LB, irrespective of the number of nodes being added when running `cortex cluster up`. We don't...
#### Description
Change to appropriate resource requests and/or limits for Neuron k8s device plugin. The AWS team has said their device plugin uses few CPU/Mem resources, but that's still not...
#### Description
Assuming a new node group is to be added and another one is to be removed in one go with `cortex cluster configure`, take into consideration the temporary...
#### Description
If a node group failed to be added to the cluster prior to running a successful subsequent `cortex cluster configure` on the same node group, then the pods...
### Description
The prometheus and grafana EBS volumes created by the cluster are not tagged with Cortex's tags (default or user).
### Notes
At this time, it does not seem...