nm-vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Results 32 nm-vllm issues

To be merged after upstream sync. This PR does two things:
- a) changes the set of tests that we run on remote push
- b) converts to using environment...

## Notes
This PR is a work in progress and is based on https://github.com/vllm-project/vllm/pull/6396, so that PR will have to land before this one.
## Description
This PR introduces a spiritual successor...