jetstream-pytorch
Add per-request sampling support.
Supports per-request sampling. When the user sets sampling_algorithm to '', each request can send its own sampler config (algorithm, temperature, topk, nucleus) to select a different sampling strategy. We don't yet have a good way to support arbitrary user-provided sampling functions, due to the limitations of JIT compilation.
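To illustrate the idea, here is a minimal plain-Python sketch of dispatching on a per-request sampler config. It is not the code in this PR: the `SamplerConfig` class, its field defaults, the algorithm names ("greedy", "weighted", "topk", "nucleus"), and the `sample` helper are all assumptions made for the example, and it runs eagerly rather than under JIT.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class SamplerConfig:
    # Hypothetical per-request config; names and defaults are illustrative only.
    algorithm: str = "greedy"  # assumed values: greedy, weighted, topk, nucleus
    temperature: float = 1.0
    topk: int = 1
    nucleus: float = 1.0       # cumulative-probability (top-p) threshold


def softmax(logits, temperature):
    # Temperature-scaled softmax, shifted by the max for numerical stability.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample(logits, cfg, rng):
    """Pick one token id from `logits` according to the request's config."""
    if cfg.algorithm == "greedy":
        return max(range(len(logits)), key=logits.__getitem__)

    probs = softmax(logits, cfg.temperature)
    # Token ids ordered from most to least probable.
    order = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)

    if cfg.algorithm == "topk":
        order = order[:max(1, cfg.topk)]
    elif cfg.algorithm == "nucleus":
        kept, cumulative = [], 0.0
        for token_id in order:
            kept.append(token_id)
            cumulative += probs[token_id]
            if cumulative >= cfg.nucleus:
                break
        order = kept
    elif cfg.algorithm != "weighted":
        raise ValueError(f"unknown sampling algorithm: {cfg.algorithm!r}")

    # random.choices renormalizes the weights of the surviving candidates.
    weights = [probs[token_id] for token_id in order]
    return rng.choices(order, weights=weights, k=1)[0]


rng = random.Random(0)
logits = [0.1, 2.0, 0.5, 1.0]
greedy_token = sample(logits, SamplerConfig(), rng)
topk_token = sample(logits, SamplerConfig("topk", temperature=0.7, topk=2), rng)
```

The set of strategies here is fixed, and each request only supplies values that choose among them, which is the kind of selection a compiled function can accommodate; an arbitrary Python callable sent per request does not fit that shape.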
A follow-up PR will enable this on the JetStream side so that the end-to-end workflow works.