Baizhou Zhang

Results 16 issues of Baizhou Zhang

### 📚 The doc issue

In the "Train with real prompt data" part of ColossalAI/applications/ChatGPT/examples/README.md, `torchrun --standalone --nproc_per_node=2 train_prompts.py prompts.csv --strategy colossalai` should be replaced by `torchrun --standalone --nproc_per_node=2 train_prompts.py...

documentation

## 📌 Checklist before creating the PR

- [x] I have created an issue for this PR for traceability
- [x] The title follows the standard format: `[doc/gemini/tensor/...]: A concise...

Support gradient accumulation for the hybrid parallel plugin by implementing a `no_sync` method for the plugin. Relevant issue: #4776

### PR Category

CINN

### PR Types

Improvements

### Description

Reverse the order of `cinn_group_cluster_pass` and `add_store_in_fusion_pass` to simplify the fusion process in the group cluster pass.

PCard-76996

### Checklist

- [x] 1. If the issue you raised is not a feature but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed.
- [x]...

feature
lora

## Motivation

#3323

The [Grouped GEMM kernel](https://developer.nvidia.com/blog/introducing-grouped-gemm-apis-in-cublas-and-more-performance-updates/) added in cuBLAS 12.5 is useful. It can be applied to MoE EP layers and LoRA layers for acceleration.

## Modifications

- Add `cublas_grouped_gemm` in sgl-kernel...

### Checklist

- [x] 1. If the issue you raised is not a feature but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed.
- [x]...

help wanted
lora

## Motivation

`flashinfer_backend.py` for attention is too complex. This PR extracts the MLA logic into a new `flashinfer_mla_backend.py`.

## Modifications

- Define `FlashInferMLAAttnBackend` in `flashinfer_mla_backend.py` by removing code...

high priority

### Checklist

- [x] 1. If the issue you raised is not a feature but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed.
- [x]...

enhancement
lora