Zhanghao Wu

Results 222 issues of Zhanghao Wu

Tested (run the relevant ones): - [ ] Code formatting: install pre-commit (auto-check on commit) or `bash format.sh` - [ ] Any manual or new tests for this PR (please...

A user encountered an issue when a GKE cluster has a node pool with an instance type that does not appear in the GCP catalog, with `autoscaler: gke` set in the config....

[https://www.notion.so/Release-Process-2c4d3d4bb77480b9be52ec835111b2e6](https://www.notion.so/Release-Process-2c4d3d4bb77480b9be52ec835111b2e6)

> We mainly want to be able to specify something like a "job group" in a single YAML file and launch/stop it with a single command line. Each job in...

This is an example to enable large-scale parallel model evaluation with SkyPilot + Promptfoo ![](https://i.imgur.com/BskVWdn.png) ![](https://i.imgur.com/ptuYADo.png) ### Why SkyPilot? **SkyPilot automates the complex infrastructure setup** needed for large-scale model evaluation:...

Stale

## Summary - format CPU values in resource strings to preserve fractional requests - add a unit test ensuring fractional Kubernetes CPU requests display correctly ## Testing - pytest tests/unit_tests/test_sky/utils/test_cli_utils.py...

codex
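As a rough illustration of the fractional-CPU formatting described in the summary above, here is a minimal sketch. The function name `format_cpus` is hypothetical and is not taken from the PR; it only shows one way to keep fractional requests such as `0.5` visible while still printing whole counts without a trailing `.0`.

```python
def format_cpus(cpus):
    """Render a CPU count for display (hypothetical helper, not the PR's code).

    Whole values are shown without a decimal point; fractional values
    (e.g. a Kubernetes request of 0.5 CPU) keep their fractional part.
    """
    value = float(cpus)
    if value.is_integer():
        return str(int(value))
    # 'g' formatting drops trailing zeros but keeps the fraction.
    return f"{value:g}"


if __name__ == "__main__":
    for request in (0.5, 2, 4.0, 0.25):
        print(request, "->", format_cpus(request))
```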

When uploading files to the API server, the httpx client was shared across multiple parallel upload threads, causing SSL_ALERT_BAD_RECORD_MAC errors due to corrupted SSL state from concurrent connection reuse. This...

Stale
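The shared-client problem described in the upload issue above can be avoided in several ways; the snippet is truncated before it states which fix was chosen, so the following is only a sketch of one common pattern, a per-thread client held in `threading.local`. `FakeClient` is a hypothetical stand-in so the example runs without network access; in real code it would be replaced by an actual HTTP client object.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeClient:
    """Hypothetical stand-in for an HTTP client; not a real library class."""

    def upload(self, chunk):
        # A real client would send the bytes; here we just report the size.
        return len(chunk)


# Each worker thread gets its own attribute namespace on this object.
_local = threading.local()


def get_client():
    """Return the calling thread's client, creating it on first use."""
    client = getattr(_local, "client", None)
    if client is None:
        client = FakeClient()
        _local.client = client
    return client


def upload_chunk(chunk):
    # No client object is ever shared between threads.
    return get_client().upload(chunk)


def upload_all(chunks, workers=4):
    """Upload chunks in parallel, one client per worker thread."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(upload_chunk, chunks))


if __name__ == "__main__":
    print(upload_all([b"a", b"bb", b"ccc"]))
```

The trade-off is one connection pool per thread instead of one shared pool, which costs a few extra connections but keeps each connection's state confined to a single thread.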

Changing the context name in `~/.kube/config` causes using volumes to fail.

Stale

```
file_mounts:
  /buckets/my-models:
    name: skypilot-agent-models
    store: s3
    mode: MOUNT_CACHED
```

Trying the above on a GKE cluster, and getting the following error from `kubectl logs`:

```
Installing missing packages...
```

Stale

Move our examples that have long setup times to use `uv` to reduce cold start time.

documentation
good first issue
Stale