[Neuron] Add an option to build with neuron
This PR adds an option that sets up vLLM to build with the Neuron toolchain (including neuronx-cc and transformers-neuronx).
This would help us build vllm-0.2.3+neuron211, where the Neuron version suffix is derived from the compiler version.
This is part of the effort to support accelerated LLM inference on Trainium/Inferentia (see #1866).
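
As a rough illustration of how a version string like vllm-0.2.3+neuron211 could be assembled, here is a minimal sketch. It is not taken from this PR's diff: the helper name `neuron_version_tag`, the sample compiler version `2.11.0.34`, and the rule of concatenating the compiler's major and minor numbers are all assumptions made for illustration.

```python
import re


def neuron_version_tag(base_version: str, compiler_version: str) -> str:
    """Append a local version label built from a compiler version string.

    Hypothetical helper: assumes the suffix is the compiler's major and
    minor numbers concatenated (e.g. "2.11.x" becomes "neuron211").
    """
    match = re.match(r"(\d+)\.(\d+)", compiler_version)
    if match is None:
        raise ValueError(f"unrecognized compiler version: {compiler_version!r}")
    major, minor = match.groups()
    # The "+label" form is a PEP 440 local version identifier.
    return f"{base_version}+neuron{major}{minor}"


if __name__ == "__main__":
    # "2.11.0.34" is a made-up compiler version used only as sample input.
    print(neuron_version_tag("0.2.3", "2.11.0.34"))  # prints 0.2.3+neuron211
```

Putting the toolchain version in a local version label keeps the base release number unchanged while still distinguishing a Neuron build from other builds of the same release.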