xFasterTransformer
xft version: 1.8.2
lscpu:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 52 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Vendor ID: GenuineIntel...
Bumps [gradio](https://github.com/gradio-app/gradio) from 4.37.2 to 5.0.0. Release notes, sourced from gradio's releases. gradio@5.0.0 Features: #8843 6f95286 - No token passed by default in gr.load(); #8843 6f95286 - Adding new themes...
When I build the source code on Ubuntu 22.04, I hit the issue below. How do I fix it? BTW, I used the main branch; HEAD is e73e4c1ac03f44fe986f34c01bb345e8bc5409b4 ```...
I am using xFT to test the qwen3-8B model, with the dtype set to bf16 and the KV cache set to fp16. I would like to...
When I test the model "DeepSeek-R1-Distill-Qwen-7B", the TTFT metric is worse than with OpenVINO. I don't know if this is normal. If so, is there any way to improve the performance? Environment: CPU: 2x8592+...
I have two 8592+ EMR CPUs and four nodes in my system. When running the run_benchmark.sh script with the "-s" parameter, the program just creates four processes, as follows: the...
1. I tested qwen3-8B using only node 0: "numactl --all -C 0-31 -m 0 python /home/tzk/AI_Test/xFasterTransformer/benchmark/benchmark.py --model_name qwen3-4B --token_path /data/Model_File/Qwen3-4B --model_path /data/Model_File/Qwen3-4B-xft --prompt_path /home/tzk/AI_Test/xFasterTransformer/benchmark/prompt.json --batch_size 2 --iteration...
Build fails. ``` $ pip install . -v Collecting cmake Using cached https://mirror.nju.edu.cn/pypi/web/packages/91/96/2671d7f3612c4449affc956542b25d9193efd8026dbc8ab6b3498f5cede3/cmake-4.0.0-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (27.9 MB) ... CMake Error at CMakeLists.txt:15 (cmake_minimum_required): Compatibility with CMake < 3.5 has been removed from...
When running vllm serving with 16 threads using the model DeepSeek-Distill-Qwen-7b, the result is wrong for the prompt below. Versions: xfastertransformer 1.8.2, vllm-xft 0.5.5.0. The result is correct when running 12...