CDash
CDash copied to clipboard
build summary takes more than a minute to load
I'm on cdash 3.2.3 with a huge database (more than 900G). Loading a build takes forever:
cdash | [Mon Feb 26 10:36:32.160692 2024] [php:notice] [pid 282] [client 10.10.10.51:50871] [2024-02-26 10:36:32] production.WARNING: Slow page: /api/v1/buildSummary.php?buildid=1857965 took 83.14 seconds to load \n, referer: https://cdash.starqube.com/build/1857965
$ artisan db:show
...........................................................................
Database ............................................................. cdash
Host .............................................................. database
Port .................................................................. 5432
Username ............................................................. cdash
URL ........................................................................
Open Connections ......................................................... 2
Tables ................................................................. 118
Total Size ................................................... 961,320.01MiB
Table ........................................................... Size (MiB)
feed .................................................................. 0.02
dailyupdatefile ....................................................... 0.03
build2group ......................................................... 154.95
client_library ........................................................ 0.02
site2user ............................................................. 0.04
client_jobschedule2site ............................................... 0.01
client_jobschedule2submission ......................................... 0.02
client_jobschedule2os ................................................. 0.01
test2image ............................................................ 0.03
client_jobschedule2library ............................................ 0.01
saml2_tenants ......................................................... 0.02
client_jobschedule2build .............................................. 0.01
version ............................................................... 0.01
password .............................................................. 0.01
client_site2project ................................................... 0.01
client_jobschedule2compiler ........................................... 0.01
builderrordiff ....................................................... 15.89
client_jobschedule2cmake .............................................. 0.01
build2note ............................................................ 0.02
subprojectgroup ....................................................... 0.02
client_jobschedule .................................................... 0.05
dailyupdate ........................................................... 0.52
build2uploadfile ...................................................... 0.02
jobs .................................................................. 0.10
testmeasurement ................................................. 138,274.70
coveragesummarydiff ................................................... 2.67
build2grouprule ....................................................... 0.12
client_job ............................................................ 0.04
coveragesummary ....................................................... 2.75
buildfailure ......................................................... 23.63
client_jobschedule2toolkit ............................................ 0.01
updatefile ........................................................... 98.17
failed_jobs ........................................................... 0.05
lockout ............................................................... 0.01
coveragefilepriority .................................................. 0.07
labelemail ............................................................ 0.02
client_compiler ....................................................... 0.02
label2update .......................................................... 0.01
successful_jobs ....................................................... 0.06
repositories .......................................................... 0.01
coveragefilelog .................................................. 17,942.82
coverage .......................................................... 2,027.63
buildfailure2argument ............................................... 328.82
coveragefile2user ..................................................... 0.02
coveragefile ........................................................ 445.54
client_cmake .......................................................... 0.01
configureerrordiff .................................................... 0.16
siteinformation ....................................................... 0.09
build2test ...................................................... 363,458.87
configureerror ........................................................ 2.13
testoutput ...................................................... 347,350.63
client_toolkit ........................................................ 0.02
label2test ....................................................... 81,120.70
migrations ............................................................ 0.05
client_toolkitconfiguration ........................................... 0.03
buildfailureargument .................................................. 1.11
build ............................................................. 1,092.21
banner ................................................................ 0.02
configure ......................................................... 2,507.82
subproject2build ...................................................... 0.02
client_toolkitconfiguration2os ........................................ 0.02
buildfailuredetails .................................................. 73.68
client_toolkitversion ................................................. 0.03
uploadfile ............................................................ 0.02
password_resets ....................................................... 0.01
label2dynamicanalysis ............................................... 881.68
test .................................................................. 0.40
buildemail ............................................................ 0.03
client_site2program ................................................... 0.02
label2coveragefile .................................................... 0.01
user .................................................................. 0.05
projectjobscript ...................................................... 0.02
label2buildfailure .................................................... 0.02
site .................................................................. 0.06
buildgroup ............................................................ 0.08
related_builds ........................................................ 0.02
label2build ........................................................... 0.02
apitoken .............................................................. 0.01
client_site2library ................................................... 0.02
label ................................................................. 0.18
project ............................................................... 0.06
client_site2compiler .................................................. 0.02
image ................................................................. 0.06
summaryemail .......................................................... 0.03
client_site2cmake ..................................................... 0.02
buildfile ............................................................. 3.19
projectrobot .......................................................... 0.02
filesum ............................................................... 0.02
measurement ........................................................... 0.05
user2project .......................................................... 0.09
authtoken ............................................................. 0.03
build_filters ......................................................... 0.02
user2repository ....................................................... 0.06
build2configure ...................................................... 98.25
subproject ............................................................ 0.04
client_site ........................................................... 0.03
build2update ......................................................... 92.41
buildgroupposition .................................................... 0.07
testdiff ............................................................. 19.89
errorlog .............................................................. 0.05
project2repositories .................................................. 0.01
buildinformation ..................................................... 99.74
overview_components ................................................... 0.05
dynamicanalysissummary ............................................... 16.52
dynamicanalysisdefect ................................................ 12.86
subproject2subproject ................................................. 0.02
client_os ............................................................. 0.04
userstatistics ........................................................ 2.28
buildnote ............................................................. 0.02
builderror ........................................................ 1,912.35
dynamicanalysis ................................................... 3,026.26
usertemp .............................................................. 0.02
pending_submissions ................................................... 2.49
buildproperties ....................................................... 0.02
note .................................................................. 0.02
buildtesttime ........................................................ 39.00
blockbuild ............................................................ 0.05
buildupdate ......................................................... 184.63
Is there any way to improve this? Is this related to the database size?
Cheers
A couple things to check:
- If you have database access, you could do a
show processlistwhen the bulid summary page is slowly loading. That should provide some information about what queries are performing poorly. If you can share this information with us, we'll take a closer look to see if there's anything we can update in CDash to make these queries more performant. - You can also do a
SELECT min(starttime) FROM buildto see if automatic deletion of old builds is working properly or not. If you have ancient builds in your database, you can manually trigger the autoremove script by using thephp artisan build:remove allcommand from the root of your CDash directory. This will likely also be a very slow operation, but reclaiming disk space and reducing the size of your database should help improve CDash's performance.
Here are the requests for building a summary in my case, the very long one at the end is the one taking all the time:
LOG: statement: SELECT * FROM buildnote WHERE buildid='1861237' ORDER BY timestamp ASC
LOG: statement: select * from "site" where "site"."id" = 87 limit 1
LOG: statement: select count(*) as aggregate from "build2note" where "buildid" = 1861237
LOG: statement: SELECT * FROM buildinformation WHERE buildid = '1861237'
LOG: statement: SELECT label.id as labelid FROM label WHERE label.id IN
(SELECT labelid AS id FROM label2build
WHERE label2build.buildid = '1861237') OR label.id IN
(SELECT labelid AS id FROM label2test
WHERE label2test.buildid = '1861237') OR label.id IN
(SELECT labelid AS id FROM label2coveragefile
WHERE label2coveragefile.buildid = '1861237') OR label.id IN (
SELECT l2bf.labelid AS id
FROM label2buildfailure AS l2bf
LEFT JOIN buildfailure AS bf ON (bf.id=l2bf.buildfailureid)
LEFT JOIN buildfailuredetails AS bfd ON (bfd.id=bf.detailsid)
WHERE bfd.type='0' AND bf.buildid = '1861237') OR label.id IN (
SELECT l2bf.labelid AS id
FROM label2buildfailure AS l2bf
LEFT JOIN buildfailure AS bf ON (bf.id=l2bf.buildfailureid)
LEFT JOIN buildfailuredetails AS bfd ON (bfd.id=bf.detailsid)
WHERE bfd.type='1' AND bf.buildid = '1861237') OR label.id IN
(SELECT labelid AS id FROM label2dynamicanalysis l2da
JOIN dynamicanalysis da ON l2da.dynamicanalysisid = da.id
WHERE da.buildid = '1861237')
as per your second remark, we actually dont remove any build, but that is a policy we are enforcing and I can't change it on my own 😞 The oldest one is from 2013...
What do you get when you EXPLAIN the slow query on your system? If you're using MySQL, ensure you run SET @@explain_format=TREE; to get tree structuring. Postgres does that automatically.
I must confess I'm very know to postgres... Here's the raw output:
cdash=# EXPLAIN SELECT label.id as labelid FROM label WHERE label.id IN
(SELECT labelid AS id FROM label2build
WHERE label2build.buildid = '1861237') OR label.id IN
(SELECT labelid AS id FROM label2test
WHERE label2test.buildid = '1861237') OR label.id IN
(SELECT labelid AS id FROM label2coveragefile
WHERE label2coveragefile.buildid = '1861237') OR label.id IN (
SELECT l2bf.labelid AS id
FROM label2buildfailure AS l2bf
LEFT JOIN buildfailure AS bf ON (bf.id=l2bf.buildfailureid)
LEFT JOIN buildfailuredetails AS bfd ON (bfd.id=bf.detailsid)
WHERE bfd.type='0' AND bf.buildid = '1861237') OR label.id IN (
SELECT l2bf.labelid AS id
FROM label2buildfailure AS l2bf
LEFT JOIN buildfailure AS bf ON (bf.id=l2bf.buildfailureid)
LEFT JOIN buildfailuredetails AS bfd ON (bfd.id=bf.detailsid)
WHERE bfd.type='1' AND bf.buildid = '1861237') OR label.id IN
(SELECT labelid AS id FROM label2dynamicanalysis l2da
JOIN dynamicanalysis da ON l2da.dynamicanalysisid = da.id
WHERE da.buildid = '1861237');
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on label (cost=12372958.93..12372996.58 rows=1167 width=8)
Filter: ((hashed SubPlan 1) OR (hashed SubPlan 2) OR (hashed SubPlan 3) OR (hashed SubPlan 4) OR (hashed SubPlan 5) OR (hashed SubPlan 6))
SubPlan 1
-> Bitmap Heap Scan on label2build (cost=4.22..14.76 rows=9 width=8)
Recheck Cond: (buildid = '1861237'::bigint)
-> Bitmap Index Scan on label2build_buildid (cost=0.00..4.22 rows=9 width=0)
Index Cond: (buildid = '1861237'::bigint)
SubPlan 2
-> Seq Scan on label2test (cost=0.00..12135839.14 rows=21443 width=8)
Filter: (buildid = '1861237'::bigint)
SubPlan 3
-> Seq Scan on label2coveragefile (cost=0.00..29.62 rows=8 width=8)
Filter: (buildid = '1861237'::bigint)
SubPlan 4
-> Nested Loop (cost=8.71..52.19 rows=1 width=8)
-> Hash Join (cost=8.42..43.87 rows=1 width=16)
Hash Cond: (l2bf.buildfailureid = bf.id)
-> Seq Scan on label2buildfailure l2bf (cost=0.00..28.50 rows=1850 width=16)
-> Hash (cost=8.39..8.39 rows=3 width=16)
-> Index Scan using buildid17 on buildfailure bf (cost=0.29..8.39 rows=3 width=16)
Index Cond: (buildid = '1861237'::bigint)
-> Index Scan using buildfailuredetails_pkey on buildfailuredetails bfd (cost=0.29..8.31 rows=1 width=8)
Index Cond: (id = bf.detailsid)
Filter: (type = '0'::smallint)
SubPlan 5
-> Nested Loop (cost=8.71..52.19 rows=1 width=8)
-> Hash Join (cost=8.42..43.87 rows=1 width=16)
Hash Cond: (l2bf_1.buildfailureid = bf_1.id)
-> Seq Scan on label2buildfailure l2bf_1 (cost=0.00..28.50 rows=1850 width=16)
-> Hash (cost=8.39..8.39 rows=3 width=16)
-> Index Scan using buildid17 on buildfailure bf_1 (cost=0.29..8.39 rows=3 width=16)
Index Cond: (buildid = '1861237'::bigint)
-> Index Scan using buildfailuredetails_pkey on buildfailuredetails bfd_1 (cost=0.29..8.31 rows=1 width=8)
Index Cond: (id = bf_1.detailsid)
Filter: (type = '1'::smallint)
SubPlan 6
-> Hash Join (cost=420.88..236914.16 rows=1285 width=8)
Hash Cond: (l2da.dynamicanalysisid = da.id)
-> Seq Scan on label2dynamicanalysis l2da (cost=0.00..190185.40 rows=12345340 width=16)
-> Hash (cost=412.66..412.66 rows=658 width=4)
-> Index Scan using buildid6 on dynamicanalysis da (cost=0.43..412.66 rows=658 width=4)
Index Cond: (buildid = '1861237'::bigint)
(42 rows)
That's pretty useful. The sequential scans are probably the slowdown here. Since you're using Postgres, can you use EXPLAIN ANALYZE to actually run the query and report the query plan and associated costs? That will help narrow down which scan(s) in particular are problematic.
cdash=# EXPLAIN ANALYZE SELECT label.id as labelid FROM label WHERE label.id IN
(SELECT labelid AS id FROM label2build
WHERE label2build.buildid = '1861237') OR label.id IN
(SELECT labelid AS id FROM label2test
WHERE label2test.buildid = '1861237') OR label.id IN
(SELECT labelid AS id FROM label2coveragefile
WHERE label2coveragefile.buildid = '1861237') OR label.id IN (
SELECT l2bf.labelid AS id
FROM label2buildfailure AS l2bf
LEFT JOIN buildfailure AS bf ON (bf.id=l2bf.buildfailureid)
LEFT JOIN buildfailuredetails AS bfd ON (bfd.id=bf.detailsid)
WHERE bfd.type='0' AND bf.buildid = '1861237') OR label.id IN (
SELECT l2bf.labelid AS id
FROM label2buildfailure AS l2bf
LEFT JOIN buildfailure AS bf ON (bf.id=l2bf.buildfailureid)
LEFT JOIN buildfailuredetails AS bfd ON (bfd.id=bf.detailsid)
WHERE bfd.type='1' AND bf.buildid = '1861237') OR label.id IN
(SELECT labelid AS id FROM label2dynamicanalysis l2da
JOIN dynamicanalysis da ON l2da.dynamicanalysisid = da.id
WHERE da.buildid = '1861237');
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on label (cost=12373361.83..12373399.48 rows=1167 width=8) (actual time=80963.010..80963.593 rows=815 loops=1)
Filter: ((hashed SubPlan 1) OR (hashed SubPlan 2) OR (hashed SubPlan 3) OR (hashed SubPlan 4) OR (hashed SubPlan 5) OR (hashed SubPlan 6))
Rows Removed by Filter: 224
SubPlan 1
-> Bitmap Heap Scan on label2build (cost=4.22..14.76 rows=9 width=8) (actual time=0.006..0.006 rows=0 loops=1)
Recheck Cond: (buildid = '1861237'::bigint)
-> Bitmap Index Scan on label2build_buildid (cost=0.00..4.22 rows=9 width=0) (actual time=0.001..0.001 rows=0 loops=1)
Index Cond: (buildid = '1861237'::bigint)
SubPlan 2
-> Seq Scan on label2test (cost=0.00..12136242.04 rows=21444 width=8) (actual time=52464.790..79155.376 rows=1695 loops=1)
Filter: (buildid = '1861237'::bigint)
Rows Removed by Filter: 643159133
SubPlan 3
-> Seq Scan on label2coveragefile (cost=0.00..29.62 rows=8 width=8) (actual time=0.004..0.004 rows=0 loops=1)
Filter: (buildid = '1861237'::bigint)
SubPlan 4
-> Nested Loop (cost=8.71..52.19 rows=1 width=8) (actual time=0.653..0.653 rows=0 loops=1)
-> Hash Join (cost=8.42..43.87 rows=1 width=16) (actual time=0.653..0.653 rows=0 loops=1)
Hash Cond: (l2bf.buildfailureid = bf.id)
-> Seq Scan on label2buildfailure l2bf (cost=0.00..28.50 rows=1850 width=16) (actual time=0.617..0.617 rows=1 loops=1)
-> Hash (cost=8.39..8.39 rows=3 width=16) (actual time=0.017..0.017 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 8kB
-> Index Scan using buildid17 on buildfailure bf (cost=0.29..8.39 rows=3 width=16) (actual time=0.017..0.017 rows=0 loops=1)
Index Cond: (buildid = '1861237'::bigint)
-> Index Scan using buildfailuredetails_pkey on buildfailuredetails bfd (cost=0.29..8.31 rows=1 width=8) (never executed)
Index Cond: (id = bf.detailsid)
Filter: (type = '0'::smallint)
SubPlan 5
-> Nested Loop (cost=8.71..52.19 rows=1 width=8) (actual time=0.019..0.019 rows=0 loops=1)
-> Hash Join (cost=8.42..43.87 rows=1 width=16) (actual time=0.018..0.018 rows=0 loops=1)
Hash Cond: (l2bf_1.buildfailureid = bf_1.id)
-> Seq Scan on label2buildfailure l2bf_1 (cost=0.00..28.50 rows=1850 width=16) (actual time=0.002..0.002 rows=1 loops=1)
-> Hash (cost=8.39..8.39 rows=3 width=16) (actual time=0.002..0.002 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 8kB
-> Index Scan using buildid17 on buildfailure bf_1 (cost=0.29..8.39 rows=3 width=16) (actual time=0.002..0.002 rows=0 loops=1)
Index Cond: (buildid = '1861237'::bigint)
-> Index Scan using buildfailuredetails_pkey on buildfailuredetails bfd_1 (cost=0.29..8.31 rows=1 width=8) (never executed)
Index Cond: (id = bf_1.detailsid)
Filter: (type = '1'::smallint)
SubPlan 6
-> Hash Join (cost=420.88..236914.16 rows=1285 width=8) (actual time=1806.427..1806.427 rows=0 loops=1)
Hash Cond: (l2da.dynamicanalysisid = da.id)
-> Seq Scan on label2dynamicanalysis l2da (cost=0.00..190185.40 rows=12345340 width=16) (actual time=0.201..1166.732 rows=12344045 loops=1)
-> Hash (cost=412.66..412.66 rows=658 width=4) (actual time=0.043..0.043 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Index Scan using buildid6 on dynamicanalysis da (cost=0.43..412.66 rows=658 width=4) (actual time=0.038..0.038 rows=1 loops=1)
Index Cond: (buildid = '1861237'::bigint)
Planning time: 1.370 ms
Execution time: 80963.710 ms
(49 rows)
Based on those results, an index on label2test and label2dynamicanalysis would probably speed things up substantially.
I added both indexes and the request is now almost instantaneous. Thanks so much for your help! Shall I consider this closed or do you wanna add those indexes by default in CDash?
We'd like to include these indexes in CDash, so this issue can stay open until that happens. I'll happily review any PRs you create to upstream them. Otherwise, I'll add it to my todo list.
Closed by #2103 and #2104.