cortex

Production infrastructure for machine learning at scale

Results 121 cortex issues

#### Description If the `max_replicas` field is not set but the `min_replicas` field is, then default `max_replicas` to the value of `min_replicas`.
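The defaulting rule described in this issue could be sketched roughly as follows. This is a minimal illustration, not Cortex's actual implementation: the function name `default_max_replicas` is hypothetical, and `None` is assumed to represent a field that was not set.

```python
from typing import Optional, Tuple


def default_max_replicas(
    min_replicas: Optional[int], max_replicas: Optional[int]
) -> Tuple[Optional[int], Optional[int]]:
    """Hypothetical helper: fill in max_replicas from min_replicas when only the latter is set."""
    # Assumption: None means the field was omitted from the spec.
    if max_replicas is None and min_replicas is not None:
        max_replicas = min_replicas
    # All other combinations are returned unchanged; the issue does not cover them.
    return min_replicas, max_replicas
```

For example, `default_max_replicas(3, None)` would yield `(3, 3)`, while an explicitly set `max_replicas` is left untouched.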

#### Description At bigger scales, the cluster runs out of IPs for its nodes/pods. #### Solutions 1. Use "Custom CNI networking", which basically adds another CIDR block to the VPC...

performance

#### Description The user may want to use a regex instead of listing all of the node groups to use in the API spec: ```yaml # api spec # ......
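One way the regex-based selection requested here might work is sketched below. The function name `match_node_groups` and the node group names are hypothetical, and requiring the pattern to match the whole name is an assumption; the issue preview does not show how the pattern would be specified in the API spec.

```python
import re
from typing import Iterable, List


def match_node_groups(pattern: str, available: Iterable[str]) -> List[str]:
    """Hypothetical helper: return the node group names that fully match the pattern."""
    compiled = re.compile(pattern)
    # fullmatch is an assumption: it avoids accidental partial matches such as
    # "gpu" also selecting a group named "cpu-gpu-mixed".
    return [name for name in available if compiled.fullmatch(name)]


# Illustrative node group names, not taken from any real cluster config.
groups = ["gpu-small", "gpu-large", "cpu-general"]
selected = match_node_groups(r"gpu-.*", groups)
```

With the illustrative names above, `selected` contains only the two `gpu-` groups, so the user would not have to list each one explicitly.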

enhancement

#### Description The `cluster-autoscaler` can use a lot of memory when there are many nodes to keep track of and/or many nodes that need to be added. The...

performance

#### Description Find a way to restrict the number of discovered targets per LB, irrespective of the number of nodes being added when running `cortex cluster up`. We don't...

enhancement
research

#### Description Set appropriate resource requests and/or limits for the Neuron k8s device plugin. The AWS team has said their device plugin uses little CPU/memory, but that's still not...

refactor
blocked

#### Description Assuming a new node group is to be added and another one is to be removed in one go with `cortex cluster configure`, take into consideration the temporary...

bug

#### Description If a node group failed to be added to the cluster prior to running a successful subsequent `cortex cluster configure` on the same node group, then the pods...

bug

### Description The Prometheus and Grafana EBS volumes created by the cluster are not tagged with Cortex's tags (default or user). ### Notes At this time, it does not seem...

bug