cortex

Production infrastructure for machine learning at scale

Results 121 cortex issues

#### Description

Currently, requests are assigned to replicas at random. A smarter approach would be to assign based on least recently accessed (i.e. strict ordering), smallest queue size, or something...

enhancement
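The "smallest queue size" option mentioned in the issue above could look roughly like the following. This is only a sketch: the `Replica` type, its `QueueLen` field, and `PickLeastQueued` are hypothetical names invented for illustration and are not taken from the Cortex codebase.

```go
package main

import (
	"errors"
	"fmt"
)

// Replica is a hypothetical stand-in for an API replica.
type Replica struct {
	Name     string
	QueueLen int // number of requests currently waiting on this replica
}

// PickLeastQueued returns the index of the replica with the shortest queue.
// Ties go to the earliest replica in the slice.
func PickLeastQueued(replicas []Replica) (int, error) {
	if len(replicas) == 0 {
		return -1, errors.New("no replicas available")
	}
	best := 0
	for i := 1; i < len(replicas); i++ {
		if replicas[i].QueueLen < replicas[best].QueueLen {
			best = i
		}
	}
	return best, nil
}

func main() {
	replicas := []Replica{{"a", 3}, {"b", 1}, {"c", 2}}
	i, err := PickLeastQueued(replicas)
	if err != nil {
		panic(err)
	}
	replicas[i].QueueLen++ // the chosen replica takes the request
	fmt.Println("routed to", replicas[i].Name)
}
```

A real router would also need to keep queue lengths up to date under concurrency, which this sketch leaves out.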

#### Description

Currently, `kubectl` is required to use a private docker registry (see here: https://docs.cortex.dev/guides/private-docker).

#### Possible designs

* Add a CLI command, e.g. `cortex cluster docker-login`
* Add `docker_registry_username`...

enhancement

#### Description

Add a timeout to CLI requests to the operator. Here are some potential reasons for why requests may hang (verify and address):

* operator doesn't exist (crashed and...

enhancement

Currently, if you install the CLI on a new machine and use different AWS credentials (with the `AdministratorAccess` IAM policy attached), running `cortex cluster` commands will not work. We link...

research

#### Description

Add a command (e.g. `cortex cluster list`) which lists all cortex clusters. Include clusters that are running or are in an unexpected/incomplete state. If it's simple, there could...

enhancement

#### Description

When `cluster up` fails to bring up spot nodes (as a requirement for the autoscaler to infer the instances' resource specs), the advertised memory of a node...

bug

When creating a cluster which uses 100% spot instances (with or without `on_demand_backup`) and has `min_instances` > 0, if spot instances are not available, cluster creation hangs in `eksctl`. The...

bug

#### Reproduction steps

1. Create a cluster with `instance_type: inf1.6xlarge`, `min_instances: 1`, and `max_instances: 2`
2. Add `min_replicas: 4` to `examples/tensorflow/image-classifier-resnet50/cortex_inf.yaml`, and `cortex deploy` it
3. Wait for 4 replicas...

research

#### Description

Currently, a GET call on the Job displays the following fields specified in the job submission:

* config
* workers

Keys such as `delimited_files`, `file_path_lister` and `item_list` are...

enhancement

#### Description

As of v0.19, job submissions are validated with custom code rather than using the config reader. Config reader wasn't used because it doesn't support `json.RawMessage` (or just `[]byte`)....

enhancement
refactor
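For context on the last issue, `json.RawMessage` from Go's `encoding/json` leaves part of a payload undecoded, so any checks on that part have to be written by hand. The sketch below illustrates the idea. `JobSubmission`, its fields, and `ParseSubmission` are hypothetical and only loosely modeled on the `workers` and `config` keys mentioned in the listing; they do not reflect Cortex's actual validation code.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// JobSubmission is a hypothetical shape, not Cortex's actual struct.
type JobSubmission struct {
	Workers int             `json:"workers"`
	Config  json.RawMessage `json:"config"` // kept as raw bytes, decoded later
}

// ParseSubmission decodes the payload and applies hand-written checks.
func ParseSubmission(data []byte) (*JobSubmission, error) {
	var sub JobSubmission
	if err := json.Unmarshal(data, &sub); err != nil {
		return nil, err
	}
	if sub.Workers < 1 {
		return nil, errors.New("workers must be at least 1")
	}
	return &sub, nil
}

func main() {
	sub, err := ParseSubmission([]byte(`{"workers":2,"config":{"threshold":0.5}}`))
	if err != nil {
		panic(err)
	}
	fmt.Printf("workers=%d config=%s\n", sub.Workers, sub.Config)
}
```

The raw `config` bytes pass through untouched, which is convenient for forwarding user-defined settings but means a generic validator cannot inspect them without additional support.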