Liger-Kernel issues

GeGLU kernel numerical issue

### 🐛 Describe the bug The current Liger GeGLU kernel does not match torch's implementation closely, even in fp32. `test_geglu.py` fails when run with tighter tolerances. Some models couldn't pass the...

Tcc0403
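To make "not close, even in fp32" concrete, here is a minimal pure-Python sketch of how a tolerance check between two GeGLU implementations behaves. It is not the Liger kernel and not the contents of `test_geglu.py`: the exact (erf) GELU and the tanh approximation are only stand-ins for "reference" and "kernel under test", and every function name below is hypothetical. The issue does not state the cause of the mismatch, so the sketch makes no claim about it.

```python
import math

def gelu_exact(x: float) -> float:
    # Exact GELU: x * Phi(x), with Phi the standard normal CDF.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x: float) -> float:
    # Widely used tanh approximation of GELU.
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))

def geglu(gate, up, gelu):
    # GeGLU gating: GELU(gate) * up, elementwise.
    return [gelu(g) * u for g, u in zip(gate, up)]

def max_abs_diff(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def allclose(a, b, rtol, atol):
    # Same acceptance rule as the usual allclose: |a - b| <= atol + rtol * |b|.
    return all(abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b))

# Hypothetical inputs: a sweep of gate values with a constant "up" projection.
gate = [i / 10.0 for i in range(-40, 41)]
up = [1.0] * len(gate)

ref = geglu(gate, up, gelu_exact)      # stand-in for the reference
approx = geglu(gate, up, gelu_tanh)    # stand-in for the kernel under test
diff = max_abs_diff(approx, ref)

print(f"max abs diff: {diff:.3e}")
print("passes at atol=1e-2:", allclose(approx, ref, rtol=0.0, atol=1e-2))
print("passes at atol=1e-6:", allclose(approx, ref, rtol=0.0, atol=1e-6))
```

The same pair of outputs passes at a loose tolerance and fails at a tight one, which is the situation the issue describes for the test suite.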

Moe kernel for latest transformers v5

### 🐛 Describe the bug Experts will no longer be just a ModuleList of MLP layers. It's time to write an MoE kernel for MoE layers! https://github.com/huggingface/transformers/pull/41580/files#diff-0855b77fc27ad9449158a1c74953f909b011c00de7125f7c8e68d0ff209c092aR218 ### Reproduce _No response_...

Tcc0403

[Feature Request] Add SwiGLU support for Qwen3-VL models

### 🚀 The feature, motivation and pitch ### Problem Description Can we add **SwiGLU** kernel support for **Qwen3-VL** models in Liger-Kernel? Currently, it appears that passing `swiglu=True` in the `apply_liger_kernel_to_qwen3_vl`...

matthewdm0816

[RFC] Native Ascend NPU Support for Liger Kernel

1

## 1. Background & Motivation Ascend NPU is a default PyTorch device backend, natively compatible with ecosystems like Transformers, FlagGems, and Llama Factory. We’re also enabling Triton support (repo: [triton-ascend](https://gitcode.com/Ascend/triton-ascend))....

Ginray

Add `automatically-request-copilot-review.yaml` workflow

3

# Summary This PR adds a GitHub Actions workflow to automatically request Copilot code reviews for pull requests in **linkedin/Liger-Kernel**. ## Changes - Added `.github/workflows/automatically-request-copilot-review.yaml` - The workflow will automatically...

ChrisCarini

[GRPO] chunk over vocab without materializing logits

## Summary Updates the forward pass to compute only the required per-token log probabilities, simplifies the loss function interface, and adds comprehensive tests to ensure correctness against the Triton implementation:...

kashif
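The idea named in the title, computing a per-token log probability while scanning the vocabulary in chunks instead of materializing the full logits row, can be sketched in pure Python with a running log-sum-exp. This is an illustrative sketch only, not the PR's code: the function names, the list-of-lists weight layout, and the sample values are all assumptions made for the example.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def token_logprob_full(hidden, weight, target):
    # Baseline: materialize every logit, then take log-softmax at `target`.
    logits = [dot(hidden, row) for row in weight]
    m = max(logits)
    lse = m + math.log(sum(math.exp(l - m) for l in logits))
    return logits[target] - lse

def token_logprob_chunked(hidden, weight, target, chunk_size):
    # Scan the vocabulary in chunks, keeping only a running max and a
    # running sum of exponentials; at most `chunk_size` logits live at once.
    running_max = -math.inf
    running_sum = 0.0
    target_logit = None
    for start in range(0, len(weight), chunk_size):
        chunk = weight[start:start + chunk_size]
        logits = [dot(hidden, row) for row in chunk]
        if start <= target < start + len(chunk):
            target_logit = logits[target - start]
        new_max = max(running_max, max(logits))
        # Rescale the accumulated sum to the new max before adding the chunk.
        running_sum = running_sum * math.exp(running_max - new_max)
        running_sum += sum(math.exp(l - new_max) for l in logits)
        running_max = new_max
    return target_logit - (running_max + math.log(running_sum))

# Hypothetical toy data: hidden size 3, vocabulary size 7.
hidden = [0.5, -1.0, 2.0]
weight = [[0.1 * (i + 1), -0.2 * i, 0.05 * (i - 3)] for i in range(7)]
target = 4

full = token_logprob_full(hidden, weight, target)
chunked = token_logprob_chunked(hidden, weight, target, chunk_size=3)
print(full, chunked)
```

Both functions return the same value up to floating-point rounding; the chunked version trades one pass over the full row for several passes over small slices, which is what keeps peak memory bounded by the chunk size.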

Tensor Parallel support

1

### 🐛 Describe the bug When enabling Tensor Parallelism with training, I get the following new error: ``` [rank1]: File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 287, in forward [rank1]: hidden_states = self.input_layernorm(hidden_states) [rank1]:...

winglian

high-priority

Maintainer List Should be updated

1

The current maintainer list in [contributing.md](https://github.com/linkedin/Liger-Kernel/blob/main/docs/contributing.md) seems to be outdated. What should it be updated to, so that new contributors know whom to tag? cc: @shimizust

ParagEkbote

Add GLM4_MOE model support

1

## Summary This PR adds support for GLM4.5 (GLM-4 MOE) models to the Liger Kernel #951 https://huggingface.co/zai-org/GLM-4.5 which share the same structure as GLM 4.6 ## Testing Done For the...

vvvdwbvvv

Add GLM-4.5 MoE

1

### 🚀 The feature, motivation and pitch Support for the GLM-4.5 and 4.6 family of models. @vvvdwbvvv ### Alternatives _No response_ ### Additional context _No response_

michaelroyzen
