afeldman-nm
FILL IN THE PR DESCRIPTION HERE FIX #xxxx (*link existing issues this PR will resolve*) **BEFORE SUBMITTING, PLEASE READ THE CHECKLIST BELOW AND FILL IN THE DESCRIPTION ABOVE** --- PR...
vLLM currently supports decoder-only models. This PR:

- Adds support for T5 (an encoder/decoder model) using the latest vLLM Attention wrapper
- Augments SequenceGroups with optional encoder sequence representations
- Modifies...
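One way to wire attention in an encoder/decoder stack like T5's is to tag each attention layer with its type and select the K/V source accordingly: self-attention reads its own input, while cross-attention reads the encoder's output. The sketch below is illustrative only; the enum and function names are assumptions for this example, not vLLM's actual API.

```python
from enum import Enum, auto


class AttentionType(Enum):
    """Hypothetical layer tags for an encoder/decoder attention stack."""
    ENCODER = auto()          # encoder self-attention (bidirectional)
    DECODER = auto()          # decoder self-attention (causal)
    ENCODER_DECODER = auto()  # decoder cross-attention over encoder output


def select_kv_source(attn_type, hidden_states, encoder_hidden_states):
    """Pick which hidden states supply K/V for a given attention layer.

    Q always comes from `hidden_states` (the layer's own input); only
    cross-attention sources its K/V from the encoder's output.
    """
    if attn_type is AttentionType.ENCODER_DECODER:
        return encoder_hidden_states  # cross-attention: K/V from encoder
    return hidden_states              # self-attention: K/V from own input
```

For a T5-style decoder block, the self-attention layer would be tagged `DECODER` and the following cross-attention layer `ENCODER_DECODER`.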
GOALS

- Whisper support
- Exemplifies encoder/decoder (E/D) support
- E/D K/V caching
- E/D parallelism

TESTING

- HuggingFace Whisper model
- Replicate public English Speech Recognition (SR) test using...
This PR is a step towards encoder/decoder model support. It (1) allows a SequenceGroup to be associated with 0 or 1 encoder sequences, and (2) causes an encoder/decoder model...
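The "0 or 1 encoder sequences" association can be modeled as an optional field on the sequence group, alongside the usual decoder sequences. This is a minimal sketch under that assumption; the class shapes are simplified stand-ins, not vLLM's real `SequenceGroup` definition.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sequence:
    """Simplified stand-in for a vLLM sequence: an id plus its token ids."""
    seq_id: int
    token_ids: List[int]


@dataclass
class SequenceGroup:
    """A request's decoder sequences, plus at most one encoder sequence."""
    request_id: str
    seqs: List[Sequence]                    # decoder sequences (as before)
    encoder_seq: Optional[Sequence] = None  # 0 or 1 encoder sequences

    @property
    def is_encoder_decoder(self) -> bool:
        # Decoder-only requests simply leave encoder_seq as None,
        # so existing decoder-only code paths are unaffected.
        return self.encoder_seq is not None
```

Keeping the encoder sequence optional means decoder-only scheduling logic can ignore it entirely, while encoder/decoder code paths can branch on `is_encoder_decoder`.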
This PR is a step towards encoder/decoder model support. It creates a specialized ModelRunner subclass for encoder/decoder models; it differs from the base ModelRunner class primarily in that it...
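The subclassing pattern described here can be sketched as a base runner that prepares decoder inputs, with an encoder/decoder subclass that extends input preparation to also gather encoder-side data. The classes and method names below are a hypothetical simplification, not vLLM's actual ModelRunner interface.

```python
class ModelRunner:
    """Base runner: prepares decoder-only model inputs."""

    def prepare_inputs(self, seq_group: dict) -> dict:
        # Decoder-only path: only decoder tokens are needed.
        return {"decoder_tokens": seq_group["decoder_tokens"]}


class EncoderDecoderModelRunner(ModelRunner):
    """Runner for encoder/decoder models.

    Reuses the base decoder input preparation, then additionally
    gathers encoder tokens (and, in a real runner, cross-attention
    metadata such as encoder sequence lengths).
    """

    def prepare_inputs(self, seq_group: dict) -> dict:
        inputs = super().prepare_inputs(seq_group)
        inputs["encoder_tokens"] = seq_group.get("encoder_tokens", [])
        return inputs
```

Subclassing keeps the decoder-only hot path untouched while isolating the extra encoder bookkeeping in one place.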
This PR makes QKV computation more efficient for cross-attention layers. `QKVParallelLinear` performs a fused self-attention QKV projection against a single hidden-state input from the previous decoder layer; this does not fit cross-attention, where Q is derived from the decoder while K and V are derived from the encoder output. Cross-attention...
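The distinction can be illustrated with plain matrix products: self-attention projects Q, K, and V from one input (which is why a fused QKV projection works there), while cross-attention needs Q from the decoder hidden states and K/V from the encoder output. The helper below is a dependency-free sketch under those assumptions, not vLLM's layer implementation.

```python
def matmul(x, w):
    """Naive (n, d_in) x (d_in, d_out) matrix product on nested lists."""
    return [[sum(xi[k] * w[k][j] for k in range(len(w)))
             for j in range(len(w[0]))]
            for xi in x]


def cross_attention_qkv(decoder_hidden, encoder_hidden, w_q, w_k, w_v):
    """Cross-attention projections: Q from the decoder, K/V from the encoder.

    Contrast with self-attention, where Q, K, and V are all projected
    from the same hidden states and the three weight matrices can be
    fused into a single GEMM (the case QKVParallelLinear handles).
    """
    q = matmul(decoder_hidden, w_q)  # queries follow the decoder stream
    k = matmul(encoder_hidden, w_k)  # keys come from encoder output
    v = matmul(encoder_hidden, w_v)  # values come from encoder output
    return q, k, v
```

Because K and V share the encoder input, a natural optimization is to fuse only the K/V projections into one GEMM and run the Q projection separately.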
This issue is not in response to a performance regression. The method of performing cross-attention QKV computations introduced in #4942 could be improved. Because this issue relates to cross-attention, it...
## Motivation

There is significant interest in vLLM supporting encoder/decoder models. Issues #187 and #180, for example, request encoder/decoder model support. As a result, encoder/decoder support was recently...