Results 9 issues of Keshav Santhanam

The lease variables `duration` and `max_duration` are too ambiguous and should be renamed.

The `Scheduler` and `Profiler` classes currently share a lot of code - we can factor this out into a common superclass (e.g. `SchedulerMechanism`)
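
A minimal sketch of what that refactor could look like. Only the class names `Scheduler`, `Profiler`, and `SchedulerMechanism` come from the issue; the job-tracking state and every method below are hypothetical placeholders for whatever code the two classes actually share.

```python
class SchedulerMechanism:
    """Hypothetical base class for state and helpers common to both subclasses."""

    def __init__(self):
        self._jobs = {}    # job_id -> job description (illustrative only)
        self._next_id = 0

    def add_job(self, job):
        """Register a job and return its newly assigned integer id."""
        job_id = self._next_id
        self._next_id += 1
        self._jobs[job_id] = job
        return job_id

    def remove_job(self, job_id):
        """Remove a job and return its description."""
        return self._jobs.pop(job_id)


class Scheduler(SchedulerMechanism):
    def schedule(self):
        # Illustrative policy: return job ids in submission order.
        return sorted(self._jobs)


class Profiler(SchedulerMechanism):
    def profile(self):
        # Illustrative measurement: length of each job description.
        return {job_id: len(job) for job_id, job in self._jobs.items()}
```

With this layout, the bookkeeping lives in one place and each subclass keeps only the behaviour specific to it.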

Variable-length generation requires `seq_idx` and `cu_seqlens` to span the full input sequence length, but in some cases I would want to include padding tokens (e.g. for maintaining static shapes...
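
To illustrate the constraint, here is a small sketch of how the two arrays relate to a packed batch. It assumes the common variable-length convention in which `cu_seqlens` holds cumulative sequence lengths with a leading zero and `seq_idx` gives the sequence index of each token; the helper name `build_varlen_metadata` and the example lengths are hypothetical.

```python
from itertools import accumulate


def build_varlen_metadata(seq_lens):
    """Build cu_seqlens and seq_idx for sequences packed back to back.

    Assumed convention: cu_seqlens[i] is the start offset of sequence i and
    cu_seqlens[-1] is the total token count; seq_idx[t] is the index of the
    sequence that token t belongs to.
    """
    cu_seqlens = [0] + list(accumulate(seq_lens))
    seq_idx = [i for i, n in enumerate(seq_lens) for _ in range(n)]
    return cu_seqlens, seq_idx


# Three requests of lengths 3, 2, and 4 packed into one buffer.
cu_seqlens, seq_idx = build_varlen_metadata([3, 2, 4])

# A statically shaped buffer of 12 tokens would leave trailing padding slots
# that neither array describes.
padded_len = 12
uncovered = padded_len - cu_seqlens[-1]
```

Here the metadata covers 9 tokens, so a 12-token buffer has 3 padding positions that fall outside `cu_seqlens` and `seq_idx`.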

The outputs produced by variable-length generation (i.e., passing `seq_idx` and `cu_seqlens`) do not match the outputs produced by sequentially generating a single request at a time. I have included a...

Updates the text generation server to use the `DynamicInferenceCoordinator`.