jetstream-pytorch
PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
Recently we added a new CLI, `jpt` (https://github.com/google/jetstream-pytorch/pull/178), that massively simplifies the command-line args the user needs to specify. However, there are other command-line args that are optional but...
Small change that allows the recently released DeepSeek R1 distilled models to be used directly. Tested on TPU v4-8 with "deepseek-ai/DeepSeek-R1-Distill-Llama-8B", and it worked.
Users can now pass in a custom sampling function. It is not per-request sampling, though, because it's hard to track all the sampling functions inside the jitted function. It applies to...
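As a rough illustration of why a single custom sampler ends up shared by all requests, the sketch below fixes the sampling function once, when the decode step is built. `make_decode_step` and `greedy` are hypothetical names chosen for this example, not jetstream-pytorch APIs, and plain Python stands in for the jitted function.

```python
from typing import Callable, List, Sequence

Sampler = Callable[[Sequence[float]], int]

def greedy(logits: Sequence[float]) -> int:
    """Example custom sampler: pick the index of the largest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])

def make_decode_step(sampler: Sampler) -> Callable[[List[Sequence[float]]], List[int]]:
    """Hypothetical helper: build a decode step with the sampler baked in."""
    def decode_step(batch_logits: List[Sequence[float]]) -> List[int]:
        # Every row (request) in the batch goes through the same sampler.
        return [sampler(row) for row in batch_logits]
    return decode_step

step = make_decode_step(greedy)
tokens = step([[0.1, 2.0, -1.0], [3.0, 0.5, 0.2]])
```

Because the sampler is captured when the step is built, switching strategy for one request would mean building (and, in a jitted setting, recompiling) a different step.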
Supports per-request sampling. When the user sets sampling_algorithm to '', each request can send a sampler config containing the algorithm, temperature, topk, and nucleus values, enabling a different sampling strategy per request. We don't...
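The sketch below shows one way a per-request sampler config could drive token selection. It is an assumption-laden illustration in plain Python, not the project's implementation: the `SamplerConfig` class, the algorithm names ("greedy", "weighted", "topk", "nucleus"), and the `sample` function are all hypothetical, and only the four field names come from the description above.

```python
import math
import random
from dataclasses import dataclass

@dataclass
class SamplerConfig:
    # Field names mirror the description above; the real request schema may differ.
    algorithm: str = "greedy"   # assumed values: greedy, weighted, topk, nucleus
    temperature: float = 1.0
    topk: int = 0
    nucleus: float = 1.0

def softmax(logits, temperature):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample(logits, cfg, rng):
    """Pick one token id from logits according to a per-request config."""
    if cfg.algorithm == "greedy":
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax(logits, cfg.temperature)
    # Candidate token ids, most probable first.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if cfg.algorithm == "topk":
        # Keep at least one candidate even if topk was left at 0.
        order = order[:max(1, cfg.topk)]
    elif cfg.algorithm == "nucleus":
        kept, mass = [], 0.0
        for i in order:
            kept.append(i)
            mass += probs[i]
            if mass >= cfg.nucleus:
                break
        order = kept
    elif cfg.algorithm != "weighted":
        raise ValueError(f"unknown sampling algorithm: {cfg.algorithm!r}")
    # random.choices normalizes the weights of the surviving candidates.
    return rng.choices(order, weights=[probs[i] for i in order], k=1)[0]

rng = random.Random(0)
logits = [2.0, 1.0, 0.1, -1.0]
per_request = [
    SamplerConfig("greedy"),
    SamplerConfig("topk", temperature=0.7, topk=2),
    SamplerConfig("nucleus", nucleus=0.9),
]
tokens = [sample(logits, cfg, rng) for cfg in per_request]
```

Each request carries its own config, so the three requests above are decoded with three different strategies against the same logits.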