TCLB icon indicating copy to clipboard operation
TCLB copied to clipboard

Profiling of LatticeAccess

Open shkodm opened this issue 9 years ago • 8 comments

I ran profiling for the TCLB. Special interest is LatticeAccess.cpp.Rt parsing time. Full report for d2q9 and d3q27 models: https://gist.github.com/shkodm/607d98a0dfe3c244a1197016f4b125d7 Top list of compilation-time eaters for d3q27. Top of d2q9 is pretty much similar.

function name total.time total.pct self.time self.pct
"<Anonymous>" 145.46 100.00 2.42 1.66
"eval" 145.46 100.00 2.00 1.37
"lapply" 145.20 99.82 1.60 1.10
"FUN" 145.18 99.81 1.42 0.98
"txtlapply" 142.20 97.76 0.00 0.00
"field.access" 140.80 96.80 0.00 0.00
"Ops.gvector.apply" 114.84 78.95 0.12 0.08
"fun" 110.98 76.30 0.60 0.41
"offset.p" 95.22 65.46 0.06 0.04
"load.field" 91.98 63.23 0.00 0.00
"aggregate" 70.38 48.38 0.56 0.38
"aggregate.pAlg" 70.06 48.16 1.20 0.82
"*" 67.22 46.21 0.04 0.03
"*.pAlg" 65.74 45.19 1.80 1.24
"save.field" 48.82 33.56 0.00 0.00
"aggregate.data.frame" 48.46 33.32 1.90 1.31
"+" 40.68 27.97 0.16 0.11
"[" 33.28 22.88 1.22 0.84
"+.pAlg" 32.32 22.22 0.10 0.07
"C" 30.54 21.00 0.02 0.01
"ToC" 30.42 20.91 0.04 0.03

Any ideas about what can be done with it?

shkodm avatar Jun 17 '16 08:06 shkodm

@shkodm when you change to the new polyAlgebra package, does the "ToC" time decreases?

llaniewski avatar Jun 17 '16 09:06 llaniewski

@llaniewski , yes, as far as I remember, it becomes at least 5 times smaller. I will report exact results of profiling with the new package in the evening.

shkodm avatar Jun 17 '16 10:06 shkodm

With new polyAlgebra package, as requested:

function name total.time total.pct self.time self.pct
"Anonymous" 127.28 100.00 1.90 1.49
"eval" 127.28 100.00 1.40 1.10
"FUN" 126.88 99.69 1.22 0.96
"lapply" 126.86 99.67 1.64 1.29
"txtlapply" 123.92 97.36 0.00 0.00
"field.access" 122.92 96.57 0.02 0.02
"Ops.gvector.apply" 118.40 93.02 0.10 0.08
"fun" 114.12 89.66 0.50 0.39
"offset.p" 98.66 77.51 0.00 0.00
"load.field" 79.40 62.38 0.00 0.00
"aggregate" 72.04 56.60 0.42 0.33
"aggregate.pAlg" 71.82 56.43 1.42 1.12
"*" 68.00 53.43 0.06 0.05
"*.pAlg" 66.38 52.15 1.62 1.27
"aggregate.data.frame" 49.14 38.61 1.72 1.35
"save.field" 43.52 34.19 0.00 0.00
"+" 43.14 33.89 0.16 0.13
"+.pAlg" 34.06 26.76 0.18 0.14
"[" 28.72 22.56 0.92 0.72
"[.data.frame" 24.94 19.59 7.64 6.00
"-" 21.82 17.14 0.14 0.11
"-.pAlg" 19.66 15.45 0.08 0.06
"data.frame" 16.76 13.17 2.78 2.18
"match" 16.36 12.85 9.66 7.59
"as.factor" 15.14 11.90 0.26 0.20
"%in%" 14.96 11.75 3.10 2.44
"rbind" 14.04 11.03 3.68 2.89
"factor" 13.68 10.75 1.58 1.24
"pAlg" 11.74 9.22 0.52 0.41
"deparse" 11.48 9.02 2.80 2.20
"as.data.frame" 10.42 8.19 0.94 0.74
"match.arg" 10.02 7.87 1.04 0.82
"[[" 9.48 7.45 1.20 0.94
"new" 9.02 7.09 0.06 0.05
"initialize" 8.84 6.95 0.90 0.71
"standardGeneric" 8.78 6.90 0.26 0.20
"C" 8.60 6.76 0.00 0.00
"gsub" 8.56 6.73 1.66 1.30
"ToC" 8.48 6.66 0.04 0.03

shkodm avatar Jun 18 '16 10:06 shkodm

@shkodm Can you post profiling "script"? Something like profile.sh d2q9 or make profile d2q9 would be nice ;)

mdzik avatar Jun 20 '16 11:06 mdzik

I think you should rather look at self time: it should point to biggest eater, total time includes sub-calls, so main has always all of it You may want to look at getting rid of or reducing number of call to top few of those. This list represents more than half of all script time.

function name total.time total.pct self.time self.pct time so far
match 18,2 12,51 10,28 7,07 10,28
[.data.frame 29,58 20,34 8,56 5,88 18,84
data.frame 25,12 17,27 3,94 2,71 22,78
names 3,86 2,65 3,86 2,65 26,64
length 3,76 2,58 3,76 2,58 30,4
%in% 17,4 11,96 3,7 2,54 34,1
rbind 18,6 12,79 3,54 2,43 37,64
deparse 14 9,62 3,5 2,41 41,14
anyDuplicated 4,56 3,13 3,46 2,38 44,6
[[.data.frame 10,9 7,49 3,28 2,25 47,88
c 3,14 2,16 2,86 1,97 50,74
[<-.data.frame 7,56 5,2 2,54 1,75 53,28
Anonymous 145,46 100 2,42 1,66 55,7
.deparseOpts 7,16 4,92 2,3 1,58 58
unique 4,64 3,19 2,22 1,53 60,22
eval 145,46 100 2 1,37 62,22
gsub 8,98 6,17 2 1,37 64,22
pmatch 2,92 2,01 1,94 1,33 66,16
aggregate.data.frame 48,46 33,32 1,9 1,31 68,06
dim 5,22 3,59 1,9 1,31 69,96
paste 9,7 6,67 1,88 1,29 71,84
*.pAlg 65,74 45,19 1,8 1,24 73,64
factor 13,58 9,34 1,76 1,21 75,4
[[ 12,6 8,66 1,7 1,17 77,1
unlist 2,48 1,7 1,68 1,15 78,78
is.factor 1,68 1,15 1,68 1,15 80,46
lapply 145,2 99,82 1,6 1,1 82,06
unique.default 2,08 1,43 1,6 1,1 83,66
prettyNum 1,78 1,22 1,56 1,07 85,22
finish.pAlg 2,82 1,94 1,54 1,06 86,76
as.character 1,44 0,99 1,44 0,99 88,2
sys.call 1,44 0,99 1,44 0,99 89,64
FUN 145,18 99,81 1,42 0,98 91,06
as.data.frame 14,88 10,23 1,42 0,98 92,48
as.list 2,38 1,64 1,38 0,95 93,86
[ 33,28 22,88 1,22 0,84 95,08
.row_names_info 1,22 0,84 1,22 0,84 96,3

mdzik avatar Jun 20 '16 11:06 mdzik

@llaniewski Can we divide parts of LatticeAcces/Dynamics.R to get little bit more fine grained dependencies and avoid recompilation of LatticeAccess?

mdzik avatar Jun 20 '16 11:06 mdzik

@mdzik , there is no script really, I just enabled profiling in RTemplate in the makefile.main.Rt & compiled.

shkodm avatar Jun 20 '16 19:06 shkodm

Can you introduce make flag? or configure option?

mdzik avatar Jun 21 '16 07:06 mdzik