vllm [Feature]: Return hidden states (in progress?)

Open Elanmarkowitz opened this issue 7 months ago • 11 comments

🚀 The feature, motivation and pitch

I know this feature request sort of already exists: https://github.com/vllm-project/vllm/issues/5950 (and older, semi-related requests: https://github.com/vllm-project/vllm/issues/3594 and https://github.com/vllm-project/vllm/issues/1857).

This is a similar pitch, but I am creating a new issue because I noticed newer developments in the codebase. The pitch is to support returning hidden states when generating sequences. This would enable many downstream uses, such as output classification and guardrails. Whereas #5950 suggested a separate step for embedding, I would suggest building it in as an option on EngineArgs or as an option that can be passed in with each generation request.
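To make the two options concrete, here is a rough sketch of how an engine-wide default and a per-request override could interact. Every name in it (`SketchEngineArgs`, `SketchRequestParams`, `resolve_return_hidden_states`) is hypothetical and made up for illustration. None of this is vLLM's actual API, and the real design may look quite different.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical stand-in for engine-level configuration (NOT vLLM's EngineArgs).
@dataclass
class SketchEngineArgs:
    model: str
    return_hidden_states: bool = False  # assumed engine-wide default


# Hypothetical stand-in for per-request parameters (NOT vLLM's SamplingParams).
@dataclass
class SketchRequestParams:
    max_tokens: int = 16
    # None means "fall back to the engine-wide default".
    return_hidden_states: Optional[bool] = None


def resolve_return_hidden_states(
    engine_args: SketchEngineArgs, params: SketchRequestParams
) -> bool:
    """Per-request setting wins when given; otherwise use the engine default."""
    if params.return_hidden_states is not None:
        return params.return_hidden_states
    return engine_args.return_hidden_states


engine_args = SketchEngineArgs(model="some-model", return_hidden_states=False)
# This request opts in even though the engine default is off.
opted_in = resolve_return_hidden_states(
    engine_args, SketchRequestParams(return_hidden_states=True)
)
# This request says nothing, so the engine default applies.
defaulted = resolve_return_hidden_states(engine_args, SketchRequestParams())
```

Supporting both levels would let a deployment keep hidden states off by default (they add memory and transfer cost) while still allowing individual requests, such as a guardrail check, to ask for them.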

I see that v0.5.1 already has some new code in ModelDriverBase to support return_hidden_states. However, I don't see it supported in the LLM engine yet (it is not an input to EngineArgs). It seems this feature is under development. I am mainly wondering what the timeline is, and what approach is being taken, so that the community and I can develop accordingly.

Alternatives

No response

Additional context

No response

Jul 06 '24 01:07 Elanmarkowitz