
JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).

JetStream issues and pull requests (35 results, sorted by recently updated):

https://github.com/AI-Hypercomputer/JetStream/blob/main/docs/observability-prometheus-metrics-in-jetstream-server.md only mentions the `prometheus_port` flag when JetStream runs with `maxengine_server`. However, no such option exists under https://github.com/AI-Hypercomputer/jetstream-pytorch. I wonder whether `jetstream-pytorch` also exposes Prometheus metrics related to inference...
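
Whether `jetstream-pytorch` exposes an equivalent endpoint is exactly the open question here, but if a server does publish Prometheus metrics, one quick way to verify is to scrape its `/metrics` endpoint directly. The host, port, and endpoint below are placeholders, not confirmed `jetstream-pytorch` behavior:

```python
# Sketch: list metric names exposed by a (hypothetical) Prometheus endpoint.
import urllib.request

def dump_metric_names(host: str = "localhost", port: int = 9090) -> None:
    url = f"http://{host}:{port}/metrics"
    with urllib.request.urlopen(url, timeout=5) as resp:
        text = resp.read().decode("utf-8")
    # Keep only metric names (skip '# HELP' / '# TYPE' comment lines).
    names = sorted({line.split("{")[0].split()[0]
                    for line in text.splitlines()
                    if line and not line.startswith("#")})
    print("\n".join(names))

if __name__ == "__main__":
    dump_metric_names()
```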

Could someone explain, or point to a doc that explains, how MoE is implemented in JetStream? Specifically: the all-to-all communications, static vs. dynamic, and the sparse matmuls. I would like to understand...
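
Not JetStream's actual implementation, but a minimal JAX sketch of the concepts being asked about: static-shaped top-1 routing expressed with one-hot dispatch masks, so the "sparse" expert matmul becomes a dense einsum with fixed shapes that XLA can compile. In an expert-parallel sharding, the `[experts, tokens, d_model]` dispatch tensor is where an all-to-all (e.g. `jax.lax.all_to_all`) would redistribute tokens across devices. All names and shapes here are illustrative:

```python
import jax
import jax.numpy as jnp

def moe_layer(tokens, gate_w, expert_w):
    """Top-1 MoE with static shapes: every expert 'sees' every token slot,
    but one-hot dispatch masks zero out tokens not routed to it."""
    num_experts = gate_w.shape[-1]
    logits = tokens @ gate_w                              # [T, E] router logits
    probs = jax.nn.softmax(logits, axis=-1)               # [T, E] gate probabilities
    expert_idx = jnp.argmax(probs, axis=-1)               # [T]    top-1 routing decision
    dispatch = jax.nn.one_hot(expert_idx, num_experts)    # [T, E] static-shaped mask

    # Dispatch: build a per-expert batch of tokens. Under expert parallelism,
    # this [E, T, D] tensor is what an all-to-all would move across devices.
    expert_in = jnp.einsum('te,td->etd', dispatch, tokens)

    # "Sparse" expert matmul expressed densely with fixed shapes.
    expert_out = jnp.einsum('etd,edf->etf', expert_in, expert_w)

    # Combine: weight each token's chosen expert output by its gate probability.
    gate = jnp.take_along_axis(probs, expert_idx[:, None], axis=-1)  # [T, 1]
    combine = dispatch * gate                                        # [T, E]
    return jnp.einsum('te,etf->tf', combine, expert_out)

# Illustrative shapes only.
key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
tokens = jax.random.normal(k1, (8, 16))        # [tokens, d_model]
gate_w = jax.random.normal(k2, (16, 4))        # [d_model, num_experts]
expert_w = jax.random.normal(k3, (4, 16, 16))  # [num_experts, d_model, d_model]
print(moe_layer(tokens, gate_w, expert_w).shape)  # (8, 16)
```

The one-hot masks keep every shape static, which is the usual trade-off behind "static vs. dynamic" routing on XLA: wasted compute on padded slots in exchange for compilable, fixed-shape all-to-alls and matmuls.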

Any plan to support TPU v3? Kaggle has been offering TPU v3, and it is a good learning and testing ground for TPU-related releases.

The gcloud setup has been failing in the Dockerfile. Previously tried: `pip install gcloud-cli-sdk`. In this PR, added changes similar to MaxText: https://github.com/AI-Hypercomputer/maxtext/blob/4ac910d3435c75ce3f922459c71181068d1d5e4e/maxtext_gpu_dependencies.Dockerfile#L14C1-L22C51
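
Not necessarily what the referenced PR does, but one common way to install the gcloud CLI in a Debian/Ubuntu-based image (similar in spirit to the MaxText Dockerfile linked above) is via Google's apt repository rather than pip:

```dockerfile
# Sketch: install the gcloud CLI from Google's apt repo (assumes a Debian/Ubuntu base image).
RUN apt-get update && apt-get install -y apt-transport-https ca-certificates gnupg curl
RUN curl https://packages.cloud.google.com/apt/doc/apt-key.gpg \
      | gpg --dearmor -o /usr/share/keyrings/cloud.google.gpg \
 && echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] https://packages.cloud.google.com/apt cloud-sdk main" \
      > /etc/apt/sources.list.d/google-cloud-sdk.list \
 && apt-get update && apt-get install -y google-cloud-cli
```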

[pull ready] The most exciting PR you'll receive this decade