ktransformers

[ci]: add int4_1 & int4_1k

Open: KMSorSMS opened this issue 3 days ago • 1 comment

What does this PR do?

Fixes CI for int4_1 & int4_1k

Before submitting

KMSorSMS, Dec 01 '25 09:12

Summary of Changes

Hello @KMSorSMS, I'm Gemini Code Assist[^1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the continuous integration and development workflow by introducing two new performance benchmarks for AMX-accelerated INT4 Mixture-of-Experts (MOE) operations, including a variant with 1K group quantization. Concurrently, it refines the pre-commit hook to be more efficient and targeted, ensuring that only relevant staged files within the kt-kernel module are formatted, thereby streamlining the commit process.
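
The staged-file selection described above can be sketched roughly as follows. This is an illustration only, not the hook from this PR: the function names `select_staged` and `staged_paths`, the extension sets, and the use of Python (rather than a shell hook) are all assumptions.

```python
import subprocess
from pathlib import PurePosixPath

# Assumed extension sets; the real hook may match a different list.
CPP_EXTS = {".c", ".cc", ".cpp", ".h", ".hpp"}
PY_EXTS = {".py"}


def select_staged(paths, root="kt-kernel"):
    """Split paths that live under `root` into (cpp_files, py_files)."""
    cpp, py = [], []
    for p in paths:
        path = PurePosixPath(p)
        # Skip anything outside the target directory.
        if not path.parts or path.parts[0] != root:
            continue
        ext = path.suffix.lower()
        if ext in CPP_EXTS:
            cpp.append(p)
        elif ext in PY_EXTS:
            py.append(p)
    return cpp, py


def staged_paths():
    """Return added, copied, or modified files in the git index."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line]
```

A hook built this way would call `select_staged(staged_paths())` and hand each list to the matching formatter, so files outside kt-kernel and unstaged files are never touched.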

Highlights

  • Refactored Pre-Commit Hook: The pre-commit hook for the kt-kernel directory has been significantly refactored to improve efficiency. It now intelligently identifies and formats only the staged C/C++ and Python files within kt-kernel, rather than attempting to format broader directories or relying on build system targets.
  • Added INT4 AMX MOE Benchmark: A new benchmark script (test_moe_amx_bench_int4_1.py) has been introduced to measure the performance (bandwidth and FLOPS) of AMX-accelerated INT4 Mixture-of-Experts (MOE) operations. This benchmark is integrated into the CPU CI pipeline.
  • Added INT4 1K Group AMX MOE Benchmark: Another new benchmark script (test_moe_amx_bench_int4_1k.py) has been added to specifically evaluate the performance of AMX-accelerated INT4 MOE operations utilizing 1K group quantization. This also registers for CPU CI.
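
As a rough illustration of the two metrics these benchmarks report, the sketch below derives bandwidth and FLOPS from a measured wall time. It is not taken from the benchmark scripts: the function name `moe_throughput`, the assumption of three projection matrices (gate, up, down) per expert, and the 0.5 bytes per INT4 weight (ignoring scales and zero points) are all hypothetical simplifications.

```python
def moe_throughput(hidden, intermediate, expert_calls, seconds,
                   bytes_per_weight=0.5):
    """Estimate (GB/s, TFLOPS) for a timed run of MoE expert invocations.

    Assumptions (not from the PR): each expert holds three
    hidden x intermediate matrices, every invocation reads all of that
    expert's weights once, and each weight contributes one multiply
    and one add.
    """
    weights_per_expert = 3 * hidden * intermediate
    flops = 2 * weights_per_expert * expert_calls
    bytes_moved = weights_per_expert * bytes_per_weight * expert_calls
    return bytes_moved / seconds / 1e9, flops / seconds / 1e12
```

Here `expert_calls` would be tokens times routed experts per token times timed iterations. Counting one full weight read per invocation is only a reasonable model for small batches; with larger batches weights are reused from cache and the bandwidth figure becomes an upper bound.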

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with :thumbsup: and :thumbsdown: on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

[^1]: Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist[bot], Dec 01 '25 09:12