dingbaorong
We used https://qianwen-res.oss-cn-beijing.aliyuncs.com/profile.py to reproduce the results. Here are the results on NVIDIA's GPU: the memory usage reported by `torch.cuda.max_memory_allocated` matches the official report, but the memory usage reported by...
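For reference, the peak-memory number discussed above comes from `torch.cuda.max_memory_allocated`. A minimal sketch of measuring it around a workload (the helper name `peak_gpu_memory_mb` is ours, not from profile.py; it returns `None` when CUDA is unavailable):

```python
import importlib.util


def peak_gpu_memory_mb(fn):
    """Run fn() and return peak CUDA memory in MiB, or None if CUDA is unavailable."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch
    if not torch.cuda.is_available():
        return None
    # Clear previous peak stats so we only measure fn()'s allocations.
    torch.cuda.reset_peak_memory_stats()
    fn()
    torch.cuda.synchronize()
    return torch.cuda.max_memory_allocated() / (1024 ** 2)
```

Usage would look like `peak_gpu_memory_mb(lambda: model.generate(inputs))`. Note this reports memory allocated by the PyTorch caching allocator, which can differ from what system tools such as `nvidia-smi` show.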
1. The performance of Qwen-14B-Chat on our machines is good. Here is our configuration:
   - **Machine**: i9 14900K; Arc A770; 64 GB DDR5 memory (Linux)
   - **bigdl's version**: 2.5.0b20231213
   - **Kernel version**: 5.19.0-41-generic
   ...
Here is how to downgrade the Linux kernel: https://github.com/intel-analytics/bigdl-llm-tutorial/blob/main/ch_6_GPU_Acceleration/environment_setup.md
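After following that guide, you can confirm which kernel is actually running. A quick check (the `running_kernel` helper is just our wrapper around the standard library):

```python
import platform


def running_kernel() -> str:
    """Return the running kernel release string, e.g. '5.19.0-41-generic'."""
    return platform.release()


print(running_kernel())
```

The output should match the kernel version you installed (5.19.0-41-generic in the configuration above) once you have rebooted into it.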
We failed to reproduce this problem on our machine (Max 1100). Environments:
```
bigdl's version: 2.5.0b20240118
[opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) Data Center GPU Max 1100 OpenCL 3.0 NEO [23.30.26918.50]
[ext_oneapi_level_zero:gpu:0]...