Liangfu Chen
`riscv-mini` currently implements RV32I of the User-level ISA Version 2.0 and the Machine-level ISA of the Privileged Architecture Version 1.7. This requires users to build and install RISC-V tools...
- scale timeline by 0.5
- allow displaying more than 24 months
## Proposal

We propose integrating transformers-neuronx as the execution engine in vLLM to support LLM inference on Inferentia. This would require changes to both transformers-neuronx and vLLM.

###...
This PR adds an option that sets up vLLM to build with the Neuron toolchain (including neuronx-cc and transformers-neuronx). This would help us build `vllm-0.2.3+neuron211`, where the neuron version...
### Purpose

This issue tracks the effort to enable TorchScript for all supported models in MultiModalPredictor. Enabling TorchScript would help us leverage OnnxRuntime and OpenVINO for...
As part of the effort to support the vLLM V1 architecture for the Neuron backend (https://github.com/vllm-project/vllm/issues/11152), this PR intends to support activation, layernorm, rotary_embedding, and logits_processor as a variant of existing modules...
With base=0, the `cos_sin_cache` would be initialized with NaN:

```
self.cos_sin_cache
tensor([[ 1.0000, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, 0.0000, nan,...
```
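
To see where the NaN values could come from, here is a minimal sketch. It assumes the cache is built with the standard rotary-embedding inverse-frequency formula, `inv_freq = 1 / base**(2i / rotary_dim)`, and uses NumPy as a stand-in for the tensor library. The helper name `build_cos_sin_cache` and the sizes are hypothetical and chosen only for illustration.

```python
import numpy as np


def build_cos_sin_cache(base: float, rotary_dim: int, max_positions: int) -> np.ndarray:
    """Hypothetical helper: each row holds the cos half followed by the sin half."""
    # Exponents 0, 2/d, 4/d, ... used by the standard RoPE formula (assumed here).
    exponents = np.arange(0, rotary_dim, 2, dtype=np.float32) / rotary_dim
    # Silence the floating-point warnings so the IEEE 754 results are easy to inspect.
    with np.errstate(divide="ignore", invalid="ignore"):
        # With base == 0: 0**0 is 1, but 0**(positive) is 0, so 1/0 becomes inf.
        inv_freq = 1.0 / np.power(np.float32(base), exponents)
        positions = np.arange(max_positions, dtype=np.float32)
        # Position 0 times inf is nan; any other position times inf stays inf.
        freqs = np.outer(positions, inv_freq)
        # cos and sin of inf or nan are both nan.
        return np.concatenate([np.cos(freqs), np.sin(freqs)], axis=-1)


bad = build_cos_sin_cache(base=0.0, rotary_dim=32, max_positions=4)
good = build_cos_sin_cache(base=10000.0, rotary_dim=32, max_positions=4)
print(np.isnan(bad).any())   # True
print(np.isnan(good).any())  # False
```

In this sketch only the first frequency survives with base=0, so the first row starts with 1 in the cos half and 0 in the sin half and is NaN everywhere else. Validating that `base` is positive before building the cache would be one way to guard against this.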