Murali Andoorveedu

Results 6 issues of Murali Andoorveedu

### 🚀 The feature, motivation and pitch

# Make more operations inplace (GELU, BatchNorm, LayerNorm)

## **Summary**

Hi PyTorch team, We would like to enable users to make the following...

module: nn
triaged
enhancement
needs research

This PR adds `send_object_list` and `recv_object_list` to `distributed_c10d.py`. It extends the object-based communication PyTorch already offers through `broadcast_object_list`; I noticed the point-to-point variants were missing and decided to upstream them. With this change,...

oncall: distributed
module: cpu
triaged
module: mkldnn
open source
module: amp (automated mixed precision)
release notes: quantization
release notes: distributed (c10d)
module: inductor
module: dynamo
module: distributed_checkpoint
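
To make the idea behind the `send_object_list` / `recv_object_list` pair above concrete, here is a minimal sketch of sending a list of picklable objects point-to-point. It is not the PR's implementation: it uses only the Python standard library over a local socket pair instead of a `torch.distributed` process group, and the helper names `send_objects`, `recv_objects`, and `demo` are hypothetical. The assumption it illustrates is the usual pattern for object-based communication: serialize each object, send the sizes first, then send the payload.

```python
import pickle
import socket
import struct


def send_objects(sock, objects):
    """Hypothetical helper: pickle each object, send a size header, then the payload."""
    payloads = [pickle.dumps(obj) for obj in objects]
    # Header: object count (uint32), then one uint64 size per object.
    header = struct.pack("!I", len(payloads))
    header += b"".join(struct.pack("!Q", len(p)) for p in payloads)
    sock.sendall(header + b"".join(payloads))


def _recv_exact(sock, n):
    """Read exactly n bytes, since recv() may return fewer than requested."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before all bytes arrived")
        buf.extend(chunk)
    return bytes(buf)


def recv_objects(sock):
    """Hypothetical helper: read the size header, then unpickle each payload."""
    (count,) = struct.unpack("!I", _recv_exact(sock, 4))
    sizes = struct.unpack(f"!{count}Q", _recv_exact(sock, 8 * count))
    return [pickle.loads(_recv_exact(sock, size)) for size in sizes]


def demo():
    # A connected socket pair stands in for two ranks (an assumption for this sketch).
    sender, receiver = socket.socketpair()
    try:
        send_objects(sender, [{"step": 1}, "hello", [1, 2, 3]])
        return recv_objects(receiver)
    finally:
        sender.close()
        receiver.close()
```

Sending the sizes ahead of the payload lets the receiver allocate and read exactly the right number of bytes per object. As with any pickle-based transport, only unpickle data from a trusted peer.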

Adds initial pipeline parallelism support to vLLM.

ToDo:

Milestone 1: POC Prototype
- [x] Make changes to support multiple schedulers and cache engines in `worker.py`, `llm_engine.py`, `async_llm_engine.py` and block managers....

This adds docs for pipeline parallelism. cc: @simon-mo @youkaichao @njhill

Add all other possible models for pipeline parallelism (PP). FIX #7684

ready

Adds tests for V1 multimodal abort (as requested by @WoosukKwon) as well as for load.

ready
v1