Performance of lib-caffe-bvlc-clblast-master-gcc-5.4.0-linux-32 on Odroid XU4
Hello,
I have 2 Odroid XU4 boards, both with the lib-caffe-bvlc-clblast-master-gcc-5.4.0-linux-32 library installed. The first board has an old version and the second has a recent version. I found that the old one performs better than the new one, as shown below:
The new one:

The old one:

Could you please tell me why this issue occurred?
Best regards
@dohai90 There could be many reasons for the regression but let's start from the most obvious ones:
- CLBlast has changed. When did you install your first (old) version?
- Is the result repeatable? How do you run the program? If you use ck run program:..., you do not fix the CPU and GPU frequency, so they may change during execution. Using ck benchmark program:... is more reliable.
Also, after using ck benchmark program (a higher-level "pipeline" which attempts to set up and monitor the CPU/GPU frequency, etc.) in both cases, please provide the log. It will help us see the resolved dependencies and their versions ... Thanks!
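To judge whether a difference between two boards is larger than the run-to-run noise, one option is to time the same command several times on each board and compare the medians and the spread. The following is only a minimal sketch: the helper names `time_command` and `summarize` are hypothetical, the command to time is taken from the command line, and it assumes the command runs non-interactively.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, repeats=5):
    """Run cmd `repeats` times; return the wall-clock time of each run in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Return (median, relative spread), with spread = (max - min) / median."""
    median = statistics.median(timings)
    return median, (max(timings) - min(timings)) / median


if __name__ == "__main__" and len(sys.argv) > 1:
    # The command to time is whatever follows the script name (an assumption:
    # it must not prompt for input).
    median, spread = summarize(time_command(sys.argv[1:]))
    print(f"median: {median:.3f} s, relative spread: {spread:.1%}")
```

If the spread on one board is comparable to the gap between the two boards, the result is probably not repeatable enough to call a regression. This does not fix the CPU or GPU frequency, so it complements ck benchmark rather than replacing it.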
Hello, @psyhtest
- I installed the old version 2 or 3 months ago.
- First, I used ck run program:caffe and got the above results. After using ck benchmark program:caffe, both give similar performance.
@gfursin
I attached here the log of both cases.
As I understand, the benchmark utility sets max frequency for both CPU and GPU, am I right?
Although the benchmark utility gives similar results in both cases, when I run the same program on the 2 boards the execution times still differ, even though I have set the max frequency for both the CPU and GPU via the commands:
./CK/ck-env/platform.init/generic-odroid/ck-set-cpu-performance
and:
./CK/ck-env/platform.init/generic-odroid/ck-set-gpu-performance
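new_device_log.txt https://github.com/dividiti/ck-caffe/files/1525640/new_device_log.txt
old_device_log.txt https://github.com/dividiti/ck-caffe/files/1525641/old_device_log.txt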
Could you give me any advice on why the same program, run on 2 boards that have both been set to max frequency, still results in different execution times? If you need it, I will upload my program here.
Thank you
If I interpret the logs correctly, the execution time is about 67 seconds in both cases? Am I missing something?
— You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub https://github.com/dividiti/ck-caffe/issues/123#issuecomment-348857008, or mute the thread https://github.com/notifications/unsubscribe-auth/AGSsuux5i9Sv-TB5S74q_XXE2-KU_UhAks5s83K7gaJpZM4QyMDw .
@psyhtest Yes, you are right, but that is the benchmark program. However, when I run the same network architecture using the lib-caffe-bvlc-clblast-master-gcc-5.4.0-linux-32 framework, the new one is slower even though I have set the max frequency for the CPU and GPU on both boards. It's really strange.
Also, from your logs it seems that you use different versions of CK (or at least of the repos). Can you please update with ck pull all? I remember fixing some problems with the scripts: you had to change to the directory containing the scripts before running them, otherwise they wouldn't run properly...
If you still have issues, please run ck-print-cpu-freq (ck-print-gpu-freq) after ck-set-cpu-performance (ck-set-gpu-performance) to check that the CPU (GPU) frequency has been set correctly.
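As a cross-check that does not depend on the CK scripts, the CPU frequencies can also be read from the standard Linux cpufreq sysfs files. This is a minimal sketch, assuming the board exposes `scaling_cur_freq` and `cpuinfo_max_freq` (both in kHz) under `/sys/devices/system/cpu/cpuN/cpufreq`; the function names are hypothetical, and the GPU is not covered because its sysfs path depends on the driver.

```python
from pathlib import Path


def read_cpu_freqs(root="/sys/devices/system/cpu"):
    """Return {core name: (current kHz, hardware max kHz)} for each core with cpufreq."""
    result = {}
    for cpufreq in sorted(Path(root).glob("cpu[0-9]*/cpufreq")):
        try:
            current = int((cpufreq / "scaling_cur_freq").read_text())
            maximum = int((cpufreq / "cpuinfo_max_freq").read_text())
        except (OSError, ValueError):
            continue  # core offline or file unreadable: skip it
        result[cpufreq.parent.name] = (current, maximum)
    return result


def cores_below_max(freqs):
    """Return the names of cores whose current frequency is below their maximum."""
    return sorted(name for name, (cur, top) in freqs.items() if cur < top)


if __name__ == "__main__":
    freqs = read_cpu_freqs()
    for name, (cur, top) in freqs.items():
        print(f"{name}: {cur} kHz (max {top} kHz)")
    print("cores below max:", cores_below_max(freqs))
```

Running it on both boards right after ck-set-cpu-performance, and again while the program is executing, would show whether a core has dropped below its maximum, for example because of thermal throttling.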
See a potential issue below.
BTW, the new CLBlast may have regressions on older architectures (though there are plans to make it more adaptive). To check this, you may want to use the dvdt profiler to profile the OpenCL kernels in Caffe:
$ ck benchmark program:... --dvdt_prof
Also, if you notice some errors in the platform scripts, please feel free to update them and provide a patch. The main idea behind CK is to collaboratively understand regressions/reproducibility issues and solve them ... Thanks!!!
In the new device log, I see:
Setting GPU frequency to max (if supported) ...
CMD to set GPU frequency:
export CK_CPU_FREQUENCY=max;/home/odroid/CK/ck-env/platform.init/generic-odroid/ck-set-gpu-performance
I would expect to see export CK_GPU_FREQUENCY here...
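A mismatch like this can be searched for mechanically in both logs. The sketch below assumes the log always announces the command with a "CMD to set CPU/GPU frequency" line followed by an export of CK_CPU_FREQUENCY or CK_GPU_FREQUENCY, as in the excerpt above; the function name `find_mismatches` is hypothetical, and whether ck-set-gpu-performance actually reads CK_GPU_FREQUENCY is not confirmed here.

```python
import re

# Patterns are modelled on the log excerpt above and may need adjusting
# for other log formats (an assumption).
HEADER_RE = re.compile(r"CMD to set (CPU|GPU) frequency", re.IGNORECASE)
EXPORT_RE = re.compile(r"export\s+CK_(CPU|GPU)_FREQUENCY=")


def find_mismatches(log_text):
    """Return (line number, expected device, exported device) for each mismatch."""
    mismatches = []
    expected = None
    for number, line in enumerate(log_text.splitlines(), start=1):
        header = HEADER_RE.search(line)
        if header:
            expected = header.group(1).upper()
        export = EXPORT_RE.search(line)
        if export and expected:
            if export.group(1) != expected:
                mismatches.append((number, expected, export.group(1)))
            expected = None  # each header is checked against one export only
    return mismatches


if __name__ == "__main__":
    excerpt = (
        "Setting GPU frequency to max (if supported) ...\n"
        "CMD to set GPU frequency:\n"
        "export CK_CPU_FREQUENCY=max;/home/odroid/CK/ck-env/"
        "platform.init/generic-odroid/ck-set-gpu-performance\n"
    )
    print(find_mismatches(excerpt))
```

Applying the same function to the text of new_device_log.txt and old_device_log.txt would show whether only the new CK version emits the CPU variable for the GPU script.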