OpenCC
Conversion speed is slow since ver.1.1.x
First of all, thanks for this great product. We noticed that conversion in ver.1.1.x is far slower than in ver.1.0.x. Here is what I tested:
### generate test data ###
$ printf "Open Chinese Convert 開放中文轉換\n%.0s" {1..50000} > /tmp/data_50k.in
$ printf "Open Chinese Convert 開放中文轉換\n%.0s" {1..500000} > /tmp/data_500k.in
### ver.1.0.6 ###
$ time opencc -i /tmp/data_50k.in -o /tmp/data.out
real 0m0.105s
user 0m0.073s
sys 0m0.027s
$ time opencc -i /tmp/data_500k.in -o /tmp/data.out
real 0m0.835s
user 0m0.722s
sys 0m0.061s
### ver.1.1.1 ###
$ time opencc -i /tmp/data_50k.in -o /tmp/data.out
real 1m0.930s
user 1m0.285s
sys 0m0.074s
# I didn't wait to finish 500k, since it took too long
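The printf commands above rely on bash brace expansion ({1..50000}) together with the %.0s conversion, which consumes one argument and prints nothing, so the format string is emitted once per argument. For shells without brace expansion, a minimal sketch of an equivalent generator follows; gen_lines is a hypothetical helper name, not part of OpenCC.

```shell
# gen_lines N: print the sample line N times
# (hypothetical helper, not part of OpenCC).
gen_lines() {
    awk -v n="$1" -v s="Open Chinese Convert 開放中文轉換" \
        'BEGIN { for (i = 0; i < n; i++) print s }'
}

# Small demonstration: three copies of the sample line.
gen_lines 3
```

Redirecting the output, for example gen_lines 50000 > /tmp/data_50k.in, should produce the same file as the printf command; wc -l on the result is a quick way to check the line count.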
Is there any way to improve the performance of ver.1.1.x? What can I do to speed it up?
Thank you
It does seem unreasonably slow from your log, but I can't reproduce the problem on my side.
Did you try to run make benchmark?
The benchmark results are as follows:
test 1
Start 1: BenchmarkTest
1: Test command: /root/OpenCC/build/perf/src/benchmark/performance
1: Test timeout computed to be: 1500
1: 2020-07-08 16:11:44
1: Running /root/OpenCC/build/perf/src/benchmark/performance
1: Run on (1 X 2200 MHz CPU )
1: CPU Caches:
1: L1 Data 32 KiB (x1)
1: L1 Instruction 32 KiB (x1)
1: L2 Unified 256 KiB (x1)
1: L3 Unified 56320 KiB (x1)
1: Load Average: 1.22, 0.61, 0.29
1: ------------------------------------------------------------------
1: Benchmark Time CPU Iterations
1: ------------------------------------------------------------------
1: BM_Initialization/s2t 25856465 ns 25663044 ns 27
1: BM_Initialization/t2s 1397494 ns 1386236 ns 499
1: BM_Initialization/s2tw 25662682 ns 25443130 ns 27
1: BM_Initialization/s2twp 26094306 ns 25921892 ns 27
1: BM_Initialization/tw2s 1444697 ns 1433357 ns 489
1: BM_Initialization/tw2sp 1784297 ns 1770124 ns 394
1: BM_Initialization/s2hk 25904954 ns 25728353 ns 27
1: BM_Initialization/hk2s 1544386 ns 1528326 ns 457
1: BM_Initialization/t2jp 149117 ns 148000 ns 4774
1: BM_Initialization/jp2t 273391 ns 271369 ns 2571
1: BM_Convert 673 ms 670 ms 1
1/1 Test #1: BenchmarkTest .................... Passed 10.78 sec
100% tests passed, 0 tests failed out of 1
Total Test time (real) = 10.79 sec
I don't know whether these benchmark results are normal. I ran all of the tests on CentOS 8 on different machine types, and they all show the performance issue with version 1.1.1.
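To make the benchmark table easier to read, the per-row times can be normalized to milliseconds. The following is a minimal sketch, assuming each row carries the "1:" test prefix shown in the log so that the name, time, and unit are the second, third, and fourth fields; bench_ms is a hypothetical helper name, not part of OpenCC.

```shell
# bench_ms: read benchmark output on stdin and print the wall-clock time of
# each BM_* row in milliseconds (hypothetical helper, not part of OpenCC).
bench_ms() {
    awk '$2 ~ /^BM_/ {
        t = $3
        if ($4 == "ns") t = t / 1000000
        else if ($4 == "us") t = t / 1000
        else if ($4 == "s") t = t * 1000
        printf "%s %.3f ms\n", $2, t
    }'
}

# Two rows copied from the log above.
bench_ms <<'EOF'
1: BM_Initialization/s2t 25856465 ns 25663044 ns 27
1: BM_Convert 673 ms 670 ms 1
EOF
```

Read this way, the slowest initialization in the log takes about 26 ms and BM_Convert about 0.7 s, so none of the benchmark rows is on the order of the 60 s seen with the opencc command.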
I've also tried different OSs in Docker, and they all show the same performance issue with ver.1.1.1.
$ docker run -it continuumio/miniconda:latest /bin/bash
### in miniconda container ###
$ printf "Open Chinese Convert 開放中文轉換\n%.0s" {1..50000} > /tmp/data_50k.in
$ apt update && apt install -y g++ make cmake doxygen && apt autoremove -y && apt autoclean -y
### ver.1.1.1 (latest) ###
$ cd /opt && git clone https://github.com/BYVoid/OpenCC.git && cd OpenCC
$ make clean && make && make install
$ time opencc -i /tmp/data_50k.in -o /tmp/data.out
real 1m2.986s
user 1m2.313s
sys 0m0.067s
$ make clean && make benchmark
test 1
Start 1: BenchmarkTest
1: Test command: /opt/OpenCC/build/perf/src/benchmark/performance
1: Test timeout computed to be: 1500
1: 2020-07-09 07:29:46
1: Running /opt/OpenCC/build/perf/src/benchmark/performance
1: Run on (1 X 2492.39 MHz CPU )
1: CPU Caches:
1: L1 Data 32 KiB (x1)
1: L1 Instruction 32 KiB (x1)
1: L2 Unified 256 KiB (x1)
1: L3 Unified 6144 KiB (x1)
1: L4 Unified 131072 KiB (x1)
1: Load Average: 1.34, 0.60, 0.29
1: ------------------------------------------------------------------
1: Benchmark Time CPU Iterations
1: ------------------------------------------------------------------
1: BM_Initialization/hk2s 1591332 ns 1566249 ns 431
1: BM_Initialization/hk2t 174952 ns 172332 ns 4073
1: BM_Initialization/jp2t 297180 ns 291991 ns 2377
1: BM_Initialization/s2hk 24813060 ns 24530145 ns 30
1: BM_Initialization/s2t 25457546 ns 25103950 ns 28
1: BM_Initialization/s2tw 24847323 ns 24614516 ns 22
1: BM_Initialization/s2twp 28968724 ns 28396501 ns 25
1: BM_Initialization/t2hk 85209 ns 84054 ns 8246
1: BM_Initialization/t2jp 179191 ns 176227 ns 3365
1: BM_Initialization/t2s 1550568 ns 1521867 ns 511
1: BM_Initialization/tw2s 1960318 ns 1912924 ns 405
1: BM_Initialization/tw2sp 2415094 ns 2367266 ns 372
1: BM_Initialization/tw2t 125114 ns 122902 ns 4993
1: BM_Convert 605 ms 598 ms 1
1/1 Test #1: BenchmarkTest .................... Passed 14.14 sec
100% tests passed, 0 tests failed out of 1
Total Test time (real) = 14.15 sec
### ver.1.0.6 ###
$ git checkout ver.1.0.6
$ make clean && make && make install
$ time opencc -i /tmp/data_50k.in -o /tmp/data.out
real 0m0.114s
user 0m0.086s
sys 0m0.015s
Version 1.0.6 doesn't have a make benchmark target, so I cannot provide benchmark results for it or compare them with version 1.1.1. If you need more information, please let me know.
@BYVoid I can confirm this issue.
$ docker run -it continuumio/miniconda:latest /bin/bash
### in miniconda container ###
$ printf "Open Chinese Convert 開放中文轉換\n%.0s" {1..50000} > /tmp/data_50k.in
$ apt update && apt install -y g++ make cmake doxygen && apt autoremove -y && apt autoclean -y
### ver.1.1.1 (latest) ###
$ cd /opt && git clone https://github.com/BYVoid/OpenCC.git && cd OpenCC
$ make clean && make && make install
$ time opencc -i /tmp/data_50k.in -o /tmp/data.out
real 0m38.791s
user 0m38.737s
sys 0m0.018s
### ver.1.0.6 ###
$ git checkout ver.1.0.6
$ make clean && make && make install
$ time opencc -i /tmp/data_50k.in -o /tmp/data.out
real 0m0.152s
user 0m0.095s
sys 0m0.027s
It should be fixed by https://github.com/BYVoid/OpenCC/commit/c2e548e5e95c9a8ccc5c1e5feb259e8885ef32c6
The conversion speed has improved hugely after c2e548e5e95c9a8ccc5c1e5feb259e8885ef32c6, but it is still about 5 times slower than ver.1.0.6. The test procedure is based on https://github.com/BYVoid/OpenCC/issues/478#issuecomment-655959879.
ver.1.1.2 (latest)
$ time opencc -i /tmp/data_50k.in -o /tmp/data.out
real 0m0.571s
user 0m0.532s
sys 0m0.020s
ver.1.0.6
$ time opencc -i /tmp/data_50k.in -o /tmp/data.out
real 0m0.116s
user 0m0.086s
sys 0m0.024s
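The slowdown factors can be checked from the real times reported in this thread. The following is a minimal sketch that converts the XmY.YYYs format printed by time into seconds and divides two measurements; to_seconds and ratio are hypothetical helper names, and the inputs are the real values quoted above.

```shell
# to_seconds 1m0.930s: convert the "XmY.YYYs" format printed by time to seconds.
to_seconds() {
    awk -v t="$1" 'BEGIN {
        split(t, p, "m")        # p[1] = minutes, p[2] = seconds with trailing "s"
        sub(/s$/, "", p[2])
        printf "%.3f\n", p[1] * 60 + p[2]
    }'
}

# ratio SLOW FAST: how many times longer SLOW took than FAST.
ratio() {
    a=$(to_seconds "$1")
    b=$(to_seconds "$2")
    awk -v a="$a" -v b="$b" 'BEGIN { printf "%.1f\n", a / b }'
}

ratio 1m0.930s 0m0.105s   # ver.1.1.1 vs ver.1.0.6 in the first report: prints 580.3
ratio 0m0.571s 0m0.116s   # ver.1.1.2 vs ver.1.0.6 after the fix: prints 4.9
```

These are single runs on different machines, so the factors are only rough indications, but they agree with the "about 5 times slower" estimate for ver.1.1.2.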