William Lin issues

Results: 9 issues by William Lin

Driver for AXI HBICAP?

Hi, will the driver for AXI HBICAP be added to this repo? If not, can someone please point me to where I can find it? I'm using Vitis + Vivado to work with...

Is activation checkpointing used for Table 5 from the FP8-LM paper?

Hi, I'm wondering whether the TFLOPs/MFU numbers in Table 5 of the paper were measured with activation checkpointing. I've looked through the MS-AMP-Examples repo and it seems like GPT3 megatron does...

Adds initial multi-step scheduling support to vLLM. RFC: https://github.com/vllm-project/vllm/issues/6854 **Current Status**: **8/8: multi-node working** 8/6: PP+TP working; PP+ray fixed; ~~a few single GPU perf regressions (easy fix)~~ 8/2 PP...

[bugfix] torch profiler bug for single gpu with GPUExecutor

`GPUExecutor` has a different API and does not define `_run_workers`. Another way to fix this would be to define the `_run_workers` API in `GPUExecutor` (it would only call the driver_worker)...

ready

[multi-step] add flashinfer backend

@WoosukKwon

ready

[bugfix] [AMD] add multi-step advance_step to ROCmFlashAttentionMetadata

I don't have AMD GPUs and cannot test locally. We could also consider moving the `advance_step` inside flash_attn.py and rocm_flash_attn.py to `AttentionMetadata` as a default implementation since the code is...

Fix custom node blocker for multi-GPU inference using multi-processing

**Why these changes are needed** -- We have created a set of custom ComfyUI nodes around [FastVideo](https://github.com/hao-ai-lab/FastVideo), a framework for multi-GPU video generation using sequence parallelism. This fixes `import` failures...

[Feature] Development Roadmap 2025 Q4/2026 Q1

### Motivation Contributions are welcome! ## Focus - Diverse model support with training and inference - Launch RL training infra and an effective training recipe - New Distillation Recipes -...

[serve] add result(...) to DeploymentResponseGenerator to fix static typing

## Description This PR adds a dummy `.result()` API to `DeploymentResponseGenerator`. `DeploymentResponseGenerator` currently doesn't support `.result()`; however, when calling `.remote()` on a `DeploymentHandle`, the return type is a Union of...

serve

unstale