Results 12 comments of star

Hi, could you provide the hardware information of the machine running the code, as well as the torch and transformers versions?

These are the results on a 2080 Ti for reference: transformers==4.19.0, torch==1.12.0. ![image](https://user-images.githubusercontent.com/30221696/181414465-92178905-dc47-49ae-b42e-346b2c3296aa.png) In addition, setting the `using_half` parameter in the test code to `True` enables EET fp16 inference and gives a better speedup.
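As a rough illustration of why an fp16 flag helps (EET's internals behind `using_half` are not shown here; this toy layer is just a stand-in for a transformer weight matrix), halving the parameter width halves memory traffic, and tensor-core GPUs such as the 2080 Ti also run fp16 matmuls considerably faster:

```python
import torch

# A toy layer standing in for a transformer weight matrix.
layer = torch.nn.Linear(1024, 1024)
fp32_bytes = sum(p.numel() * p.element_size() for p in layer.parameters())

layer.half()  # convert parameters to fp16 in place
fp16_bytes = sum(p.numel() * p.element_size() for p in layer.parameters())

# fp16 parameters occupy exactly half the memory of fp32 ones.
print(fp32_bytes // fp16_bytes)  # → 2
```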

> NVIDIA A100-SXM, torch: 1.10.1+cu111, transformers: 4.20.1, cuda: 11.1, cudatoolkit: 11.3.1, cudnn: 8.0.4, Driver Version: 515.48.07. Thank you very much for the prompt reply; that is the environment I am using. Could you provide the detailed hardware information behind the result you just posted?

NVIDIA GeForce RTX 2080 Ti, cuda: 11.6, cudnn: 8.3.3, Driver version: 470.82.01

Same question here: the multi-head QKV ordering in the ChatGLM-6B model differs from the standard GLM model. Is there an adapted version?
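For illustration, a minimal sketch of what such an adaptation involves, regrouping a fused QKV weight from a per-head interleaved layout into an "all Q, then all K, then all V" layout. Both layouts here are assumptions for the sketch, not EET's or ChatGLM-6B's actual tensor formats:

```python
import numpy as np

hidden_size, num_heads = 8, 2
head_dim = hidden_size // num_heads

# Hypothetical fused QKV weight whose output rows are interleaved per head:
# [q_h0, k_h0, v_h0, q_h1, k_h1, v_h1, ...].
w_interleaved = np.random.randn(3 * hidden_size, hidden_size).astype(np.float32)

# View as (head, qkv, head_dim, in), then move the qkv axis to the front so
# all Q rows come first, then all K rows, then all V rows.
w = w_interleaved.reshape(num_heads, 3, head_dim, hidden_size)
w_split = w.transpose(1, 0, 2, 3).reshape(3 * hidden_size, hidden_size)

# Q rows of head 0 stay at the top; K rows of head 0 move into the K block.
```

The same permutation has to be applied to the fused QKV bias for the converted checkpoint to stay consistent.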

Could you please provide the environment you use? From your information, the Thrust library, which ships with CUDA, is not installed correctly. Also, we recommend using the Dockerfile in the EET repository...

@520jefferson Which versions of nvcc and CUDA are you using? Do torch and transformers work properly?

@520jefferson Can you provide your Dockerfile? We will try to figure out what went wrong.

Thanks for the feedback. If you have any questions, please contact us.

@byshiue Thanks for your reply. I tried adding the parameter `--kv_cache_dtype fp8`, but the performance didn't seem to improve.

```
python quantize.py --model_dir ${WORK_HF_DIR} \
    --dtype float16 \
    --qformat fp8 \
    --output_dir...
```
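For context on why an fp8 KV cache mainly saves memory rather than latency at small batch sizes, here is a toy sketch using plain 8-bit integer quantization in numpy as a stand-in. TensorRT-LLM's real fp8 path uses hardware FP8 formats and calibrated scales, so this only illustrates the memory trade-off, not the actual implementation:

```python
import numpy as np

# Toy fp16 KV cache block: (num_tokens, num_heads, head_dim).
kv = np.random.randn(128, 8, 64).astype(np.float16)

# Per-tensor 8-bit quantization: one scale for the whole block.
scale = np.abs(kv).max() / 127.0
kv_q = np.clip(np.round(kv / scale), -127, 127).astype(np.int8)

# The quantized cache uses half the bytes of the fp16 one; values are
# dequantized back on read, at the cost of a small rounding error.
kv_deq = kv_q.astype(np.float16) * scale
```

Since decoding at small batch sizes is typically bandwidth-bound on weights rather than on the KV cache, the memory saving does not necessarily translate into higher throughput.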