text-generation-inference
[Prototype] Vectorized causal lm
Prototype to greatly reduce the post-processing overhead at higher batch sizes.
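
For illustration only, here is a minimal sketch of the general idea, not the code in this PR. It assumes the post-processing step is greedy next-token selection with a per-sequence temperature and an end-of-sequence check, and it uses NumPy as a stand-in for the tensor library. The function names `postprocess_loop` and `postprocess_vectorized` and their parameters are hypothetical.

```python
import numpy as np


def postprocess_loop(logits, temperatures, eos_token_id):
    """Baseline: process each sequence in the batch separately (hypothetical helper)."""
    next_tokens, token_logprobs, finished = [], [], []
    for row, temp in zip(logits, temperatures):
        scaled = row / temp
        # Numerically stable log-softmax over the vocabulary.
        shifted = scaled - scaled.max()
        logprobs = shifted - np.log(np.exp(shifted).sum())
        token = int(np.argmax(logprobs))
        next_tokens.append(token)
        token_logprobs.append(logprobs[token])
        finished.append(token == eos_token_id)
    return np.array(next_tokens), np.array(token_logprobs), np.array(finished)


def postprocess_vectorized(logits, temperatures, eos_token_id):
    """Same computation expressed as batch-wide array operations (hypothetical helper)."""
    # Broadcast one temperature per sequence across the vocabulary axis.
    scaled = logits / temperatures[:, None]
    shifted = scaled - scaled.max(axis=-1, keepdims=True)
    logprobs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    next_tokens = logprobs.argmax(axis=-1)
    # Gather the log-probability of the chosen token for every sequence at once.
    token_logprobs = logprobs[np.arange(logits.shape[0]), next_tokens]
    finished = next_tokens == eos_token_id
    return next_tokens, token_logprobs, finished
```

In the loop version the number of Python-level iterations grows with the batch size, while the vectorized version issues a fixed number of array operations regardless of batch size, which is where the overhead reduction at higher batch sizes would be expected to come from.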