ktransformers
[ci]: add int4_1 & int4_1k
What does this PR do?
Fixes CI for int4_1 & int4_1k.
Before submitting
- [x] Did you read the contributor guideline?
- [x] Did you write any new necessary tests?
Summary of Changes
Hello @KMSorSMS, I'm Gemini Code Assist[^1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request enhances the continuous integration and development workflow by introducing two new performance benchmarks for AMX-accelerated INT4 Mixture-of-Experts (MOE) operations, including a variant with 1K group quantization. Concurrently, it refines the pre-commit hook to be more efficient and targeted, ensuring that only relevant staged files within the kt-kernel module are formatted, thereby streamlining the commit process.
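As a rough illustration of the "only relevant staged files" idea, the sketch below partitions staged paths into C/C++ and Python files under `kt-kernel`. It is not the hook from this PR: the function names, the suffix sets, and the choice of `git diff --cached --name-only --diff-filter=ACMR` to list staged files are all assumptions made for the example.

```python
import subprocess
from pathlib import PurePosixPath

# Hypothetical suffix sets; the real hook may recognize a different list.
CPP_SUFFIXES = {".c", ".cc", ".cpp", ".cxx", ".h", ".hh", ".hpp"}
PY_SUFFIXES = {".py"}


def split_staged_files(paths, root="kt-kernel"):
    """Partition paths under `root` into (cpp_files, py_files).

    Paths outside `root`, and files with other suffixes, are ignored.
    """
    cpp_files, py_files = [], []
    for raw in paths:
        path = PurePosixPath(raw)
        # Compare the first path component so "kt-kernelx/..." is not matched.
        if not path.parts or path.parts[0] != root:
            continue
        suffix = path.suffix.lower()
        if suffix in CPP_SUFFIXES:
            cpp_files.append(raw)
        elif suffix in PY_SUFFIXES:
            py_files.append(raw)
    return cpp_files, py_files


def staged_paths():
    """List staged (added, copied, modified, renamed) paths via git."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACMR"],
        check=True,
        capture_output=True,
        text=True,
    )
    return [line for line in result.stdout.splitlines() if line]
```

A hook built this way would pass each returned list to the matching formatter and exit early when both lists are empty, so commits that do not touch `kt-kernel` pay no formatting cost.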
Highlights
- Refactored Pre-Commit Hook: The pre-commit hook for the `kt-kernel` directory has been refactored to improve efficiency. It now identifies and formats only the staged C/C++ and Python files within `kt-kernel`, rather than attempting to format broader directories or relying on build system targets.
- Added INT4 AMX MOE Benchmark: A new benchmark script (`test_moe_amx_bench_int4_1.py`) has been introduced to measure the performance (bandwidth and FLOPS) of AMX-accelerated INT4 Mixture-of-Experts (MOE) operations. This benchmark is integrated into the CPU CI pipeline.
- Added INT4 1K Group AMX MOE Benchmark: Another new benchmark script (`test_moe_amx_bench_int4_1k.py`) has been added to evaluate the performance of AMX-accelerated INT4 MOE operations that use 1K group quantization. It is also registered for CPU CI.
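For readers unfamiliar with how a MOE benchmark might turn a timing into the bandwidth and FLOPS figures mentioned above, here is a minimal sketch. It does not reproduce the scripts in this PR. It assumes each expert holds three projection matrices of `hidden_size x intermediate_size` weights, that INT4 weights cost 0.5 bytes each (scales and zero points ignored), that every (token, expert) pair streams the expert's weights once, and that each weight contributes one multiply-add (2 FLOPs). All names and parameters are hypothetical.

```python
def moe_throughput(
    hidden_size,
    intermediate_size,
    experts_per_token,
    tokens,
    iterations,
    elapsed_s,
    bytes_per_weight=0.5,  # assumption: INT4 weights, quantization scales ignored
):
    """Return (bandwidth in GB/s, compute in TFLOPS) for a timed MOE run."""
    # Assumption: gate, up, and down projections, each hidden x intermediate.
    weights_per_expert = 3 * hidden_size * intermediate_size
    # Number of (token, expert) evaluations across the whole timed run.
    expert_calls = tokens * experts_per_token * iterations
    bytes_read = weights_per_expert * bytes_per_weight * expert_calls
    # One multiply-add per weight counts as 2 floating-point operations.
    flops = 2 * weights_per_expert * expert_calls
    return bytes_read / elapsed_s / 1e9, flops / elapsed_s / 1e12
```

Because the sketch charges a full weight read to every (token, expert) pair, the bandwidth figure is most meaningful for small batches, where experts are rarely shared between tokens.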
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with :thumbsup: and :thumbsdown: on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
[^1]: Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.