William Lin

Results 9 issues of William Lin

Hi, will the driver for AXI HBICAP be added to this repo? If not, can someone please point me to where I can find it? I'm using Vitis + Vivado to work with...

Hi, I'm wondering whether the TFLOPs/MFU numbers in Table 5 of the paper were measured with activation checkpointing. I've looked through the MS-AMP-Examples repo and it seems like GPT3 megatron does...

Adds initial multi step scheduling support to vLLM. RFC: https://github.com/vllm-project/vllm/issues/6854 **Current Status**: **8/8: multi-node working** 8/6: PP+TP working; PP+ray fixed; ~~a few single GPU perf regressions (easy fix)~~ 8/2 PP...

`GPUExecutor` has a different API and does not define `_run_workers`. Another way to fix this would be to define a `_run_workers` API in `GPUExecutor` (it would only call the driver_worker)...

ready

@WoosukKwon FILL IN THE PR DESCRIPTION HERE FIX #xxxx (*link existing issues this PR will resolve*) **BEFORE SUBMITTING, PLEASE READ THE CHECKLIST BELOW AND FILL IN THE DESCRIPTION ABOVE** ---...

ready

I don't have AMD GPUs and cannot test locally. We can also consider moving the `advance_step` inside flash_attn.py and rocm_flash_attn.py to `AttentionMetadata` as a default implementation since the code is...

**Why these changes are needed** -- We have created a set of custom ComfyUI nodes around [FastVideo](https://github.com/hao-ai-lab/FastVideo), a framework for multi-GPU video generation using sequence parallelism. This fixes `import` failures...

### Motivation Contributions are welcome! ## Focus - Diverse model support with training and inference - Launch RL training infra and an effective training recipe - New Distillation Recipes -...

## Description This PR adds a dummy `.result()` API to `DeploymentResponseGenerator`. `DeploymentResponseGenerator` currently doesn't support `.result()`; however, when calling `.remote()` on a DeploymentHandle, the return type is a Union of...

serve
go
unstale