TCLB Profiling of LatticeAccess

I ran profiling for the TCLB. Special interest is LatticeAccess.cpp.Rt parsing time. Full report for d2q9 and d3q27 models: https://gist.github.com/shkodm/607d98a0dfe3c244a1197016f4b125d7 Top list of compilation-time eaters for d3q27. Top of d2q9 is pretty much similar.

function name	total.time	total.pct	self.time	self.pct
"<Anonymous>"	145.46	100.00	2.42	1.66
"eval"	145.46	100.00	2.00	1.37
"lapply"	145.20	99.82	1.60	1.10
"FUN"	145.18	99.81	1.42	0.98
"txtlapply"	142.20	97.76	0.00	0.00
"field.access"	140.80	96.80	0.00	0.00
"Ops.gvector.apply"	114.84	78.95	0.12	0.08
"fun"	110.98	76.30	0.60	0.41
"offset.p"	95.22	65.46	0.06	0.04
"load.field"	91.98	63.23	0.00	0.00
"aggregate"	70.38	48.38	0.56	0.38
"aggregate.pAlg"	70.06	48.16	1.20	0.82
"*"	67.22	46.21	0.04	0.03
"*.pAlg"	65.74	45.19	1.80	1.24
"save.field"	48.82	33.56	0.00	0.00
"aggregate.data.frame"	48.46	33.32	1.90	1.31
"+"	40.68	27.97	0.16	0.11
"["	33.28	22.88	1.22	0.84
"+.pAlg"	32.32	22.22	0.10	0.07
"C"	30.54	21.00	0.02	0.01
"ToC"	30.42	20.91	0.04	0.03

Any ideas about what can be done with it?

Jun 17 '16 08:06 shkodm

@shkodm when you change to the new polyAlgebra package, does the "ToC" time decreases?

Jun 17 '16 09:06 llaniewski

@llaniewski , yes, as far as I remember, it becomes at least 5 times smaller. I will report exact results of profiling with the new package in the evening.

Jun 17 '16 10:06 shkodm

With new polyAlgebra package, as requested:

function name	total.time	total.pct	self.time	self.pct
"Anonymous"	127.28	100.00	1.90	1.49
"eval"	127.28	100.00	1.40	1.10
"FUN"	126.88	99.69	1.22	0.96
"lapply"	126.86	99.67	1.64	1.29
"txtlapply"	123.92	97.36	0.00	0.00
"field.access"	122.92	96.57	0.02	0.02
"Ops.gvector.apply"	118.40	93.02	0.10	0.08
"fun"	114.12	89.66	0.50	0.39
"offset.p"	98.66	77.51	0.00	0.00
"load.field"	79.40	62.38	0.00	0.00
"aggregate"	72.04	56.60	0.42	0.33
"aggregate.pAlg"	71.82	56.43	1.42	1.12
"*"	68.00	53.43	0.06	0.05
"*.pAlg"	66.38	52.15	1.62	1.27
"aggregate.data.frame"	49.14	38.61	1.72	1.35
"save.field"	43.52	34.19	0.00	0.00
"+"	43.14	33.89	0.16	0.13
"+.pAlg"	34.06	26.76	0.18	0.14
"["	28.72	22.56	0.92	0.72
"[.data.frame"	24.94	19.59	7.64	6.00
"-"	21.82	17.14	0.14	0.11
"-.pAlg"	19.66	15.45	0.08	0.06
"data.frame"	16.76	13.17	2.78	2.18
"match"	16.36	12.85	9.66	7.59
"as.factor"	15.14	11.90	0.26	0.20
"%in%"	14.96	11.75	3.10	2.44
"rbind"	14.04	11.03	3.68	2.89
"factor"	13.68	10.75	1.58	1.24
"pAlg"	11.74	9.22	0.52	0.41
"deparse"	11.48	9.02	2.80	2.20
"as.data.frame"	10.42	8.19	0.94	0.74
"match.arg"	10.02	7.87	1.04	0.82
"[["	9.48	7.45	1.20	0.94
"new"	9.02	7.09	0.06	0.05
"initialize"	8.84	6.95	0.90	0.71
"standardGeneric"	8.78	6.90	0.26	0.20
"C"	8.60	6.76	0.00	0.00
"gsub"	8.56	6.73	1.66	1.30
"ToC"	8.48	6.66	0.04	0.03

Jun 18 '16 10:06 shkodm

@shkodm Can you post profiling "script"? Something like profile.sh d2q9 or make profile d2q9 would be nice ;)

Jun 20 '16 11:06 mdzik

I think you should rather look at self time: it should point to biggest eater, total time includes sub-calls, so main has always all of it You may want to look at getting rid of or reducing number of call to top few of those. This list represents more than half of all script time.

function name	total.time	total.pct	self.time	self.pct	time so far
match	18,2	12,51	10,28	7,07	10,28
[.data.frame	29,58	20,34	8,56	5,88	18,84
data.frame	25,12	17,27	3,94	2,71	22,78
names	3,86	2,65	3,86	2,65	26,64
length	3,76	2,58	3,76	2,58	30,4
%in%	17,4	11,96	3,7	2,54	34,1
rbind	18,6	12,79	3,54	2,43	37,64
deparse	14	9,62	3,5	2,41	41,14
anyDuplicated	4,56	3,13	3,46	2,38	44,6
[[.data.frame	10,9	7,49	3,28	2,25	47,88
c	3,14	2,16	2,86	1,97	50,74
[<-.data.frame	7,56	5,2	2,54	1,75	53,28
Anonymous	145,46	100	2,42	1,66	55,7
.deparseOpts	7,16	4,92	2,3	1,58	58
unique	4,64	3,19	2,22	1,53	60,22
eval	145,46	100	2	1,37	62,22
gsub	8,98	6,17	2	1,37	64,22
pmatch	2,92	2,01	1,94	1,33	66,16
aggregate.data.frame	48,46	33,32	1,9	1,31	68,06
dim	5,22	3,59	1,9	1,31	69,96
paste	9,7	6,67	1,88	1,29	71,84
*.pAlg	65,74	45,19	1,8	1,24	73,64
factor	13,58	9,34	1,76	1,21	75,4
[[	12,6	8,66	1,7	1,17	77,1
unlist	2,48	1,7	1,68	1,15	78,78
is.factor	1,68	1,15	1,68	1,15	80,46
lapply	145,2	99,82	1,6	1,1	82,06
unique.default	2,08	1,43	1,6	1,1	83,66
prettyNum	1,78	1,22	1,56	1,07	85,22
finish.pAlg	2,82	1,94	1,54	1,06	86,76
as.character	1,44	0,99	1,44	0,99	88,2
sys.call	1,44	0,99	1,44	0,99	89,64
FUN	145,18	99,81	1,42	0,98	91,06
as.data.frame	14,88	10,23	1,42	0,98	92,48
as.list	2,38	1,64	1,38	0,95	93,86
[	33,28	22,88	1,22	0,84	95,08
.row_names_info	1,22	0,84	1,22	0,84	96,3

Jun 20 '16 11:06 mdzik

@llaniewski Can we divide parts of LatticeAcces/Dynamics.R to get little bit more fine grained dependencies and avoid recompilation of LatticeAccess?

Jun 20 '16 11:06 mdzik

@mdzik , there is no script really, I just enabled profiling in RTemplate in the makefile.main.Rt & compiled.

Jun 20 '16 19:06 shkodm

Can you introduce make flag? or configure option?

Jun 21 '16 07:06 mdzik