domjudge
domjudge copied to clipboard
DOMjudge has been running a very slow SQL
Description of the problem
Replace this line with a short description.
The day after a contest, we suddenly found out that DOMjudge could not be accessed.
Initially, we found that the database load was very high, using up almost all the cpu cores on the machine.
By troubleshooting the database's slow SQL, we found that DOMjudge had been running a SQL statement like this
SELECT DISTINCT
j0_.judgingid AS judgingid_0,
j0_.starttime AS starttime_1,
j0_.endtime AS endtime_2,
j0_.result AS result_3,
j0_.verified AS verified_4,
j0_.jury_member AS jury_member_5,
j0_.verify_comment AS verify_comment_6,
j0_.valid AS valid_7,
j0_.output_compile AS output_compile_8,
j0_.metadata AS metadata_9,
j0_.seen AS seen_10,
j0_.judge_completely AS judge_completely_11,
j0_.uuid AS uuid_12,
j0_.cid AS cid_13,
j0_.submitid AS submitid_14,
j0_.rejudgingid AS rejudgingid_15,
j0_.prevjudgingid AS prevjudgingid_16,
j0_.errorid AS errorid_17
FROM judging j0_
LEFT JOIN rejudging r1_ ON j0_.rejudgingid = r1_.rejudgingid
INNER JOIN judging_run j2_ ON j0_.judgingid = j2_.judgingid
INNER JOIN judgetask j3_ ON j2_.judgetaskid = j3_.judgetaskid
INNER JOIN judgehost j4_ ON j3_.judgehostid = j4_.judgehostid
WHERE j4_.hostname = 'judgedaemon-802-2'
AND j0_.judgingid = j3_.jobid
AND j2_.runresult IS NULL
AND (j0_.valid = 1 OR r1_.valid = 1)
AND j0_.result <> 'compiler-error';
I tried running this SQL and it was very slow.
show full processlist;
By running the above command on MariaDB, we do see that a similar SQL statement is being run all the time, which I guess is the reason for the high load on the DB.
So what can I do about this situation.
Your environment
Include details about your installation here.
- DOMjudge version (e.g. 7.0.0 or a github commit hash)
- Operating system / Linux distribution and version (e.g. Ubuntu 18.04)
- Webserver (e.g. Apache or nginx)
- DOMjudge/8.3.0DEV/948b599af
- Ubuntu20.04
- Nginx
Steps to reproduce
Replace this with a description how we can reproduce your bug.
- Step 1
- Step 2
- Step 3
Expected behaviour
Replace this line with what you would expect to happen.
Actual behaviour
Replace this line with what happens instead.
Any other information that you want to share?
Please include webserver, symfony and judgedaemon log snippets here as appropriate. Screenshots may help in case of UI bugs.
Seems to be this query: https://github.com/domjudge/domjudge/blob/main/webapp/src/Controller/API/JudgehostController.php#L176-L192
Doing an explain gives:
1 SIMPLE j4_ const PRIMARY,hostname hostname 258 const 1 Using index; Using temporary
1 SIMPLE j2_ ALL testcaseid,judgingid,IDX_29A6E6E13CBA64F2 92 Using where
1 SIMPLE j0_ eq_ref PRIMARY PRIMARY 4 domjudge.j2_.judgingid 1 Using where
1 SIMPLE r1_ eq_ref PRIMARY PRIMARY 4 domjudge.j0_.rejudgingid 1 Using where; Distinct
1 SIMPLE j3_ eq_ref PRIMARY,jobid,judgehostid,judgehostid_jobid,judgehostid_valid_priority,specific_type PRIMARY 4 domjudge.j2_.judgetaskid 1 Using where; Distinct
Dropping the DISTINCT would give:
1 SIMPLE j4_ const PRIMARY,hostname hostname 258 const 1 Using index
1 SIMPLE j2_ ALL testcaseid,judgingid,IDX_29A6E6E13CBA64F2 92 Using where
1 SIMPLE j0_ eq_ref PRIMARY PRIMARY 4 domjudge.j2_.judgingid 1 Using where
1 SIMPLE r1_ eq_ref PRIMARY PRIMARY 4 domjudge.j0_.rejudgingid 1 Using where
1 SIMPLE j3_ eq_ref PRIMARY,jobid,judgehostid,judgehostid_jobid,judgehostid_valid_priority,specific_type PRIMARY 4 domjudge.j2_.judgetaskid 1 Using where
I'm wondering if we would do the distinct check in PHP whether that would make it faster. @Dup4 could you try running the query without the DISTINCT on your database and see how fast it is?
Doing an explain gives:
1 SIMPLE j4_ const PRIMARY,hostname hostname 258 const 1 Using index; Using temporary 1 SIMPLE j2_ ALL testcaseid,judgingid,IDX_29A6E6E13CBA64F2 92 Using where 1 SIMPLE j0_ eq_ref PRIMARY PRIMARY 4 domjudge.j2_.judgingid 1 Using where 1 SIMPLE r1_ eq_ref PRIMARY PRIMARY 4 domjudge.j0_.rejudgingid 1 Using where; Distinct 1 SIMPLE j3_ eq_ref PRIMARY,jobid,judgehostid,judgehostid_jobid,judgehostid_valid_priority,specific_type PRIMARY 4 domjudge.j2_.judgetaskid 1 Using where; DistinctDropping the
DISTINCTwould give:1 SIMPLE j4_ const PRIMARY,hostname hostname 258 const 1 Using index 1 SIMPLE j2_ ALL testcaseid,judgingid,IDX_29A6E6E13CBA64F2 92 Using where 1 SIMPLE j0_ eq_ref PRIMARY PRIMARY 4 domjudge.j2_.judgingid 1 Using where 1 SIMPLE r1_ eq_ref PRIMARY PRIMARY 4 domjudge.j0_.rejudgingid 1 Using where 1 SIMPLE j3_ eq_ref PRIMARY,jobid,judgehostid,judgehostid_jobid,judgehostid_valid_priority,specific_type PRIMARY 4 domjudge.j2_.judgetaskid 1 Using whereI'm wondering if we would do the distinct check in PHP whether that would make it faster. @Dup4 could you try running the query without the
DISTINCTon your database and see how fast it is?
So still slow but less slow? How many judgings, rejudgings, judging_runs and judgetasks do you have?
And could you try the same query but then twice:
- Once with only j0_.valid =1
- Once with only r1_.valid = 1
? So that we get rid of the
OR
So still slow but less slow? How many judgings, rejudgings, judging_runs and judgetasks do you have?
And could you try the same query but then twice:
- Once with only j0_.valid =1
- Once with only r1_.valid = 1 ? So that we get rid of the
OR
I've checked again today and it may not be a SQL statement problem.
I found out through EXPLAIN that the submission table is not indexed.
Then by SHOW INDEX I suddenly found that many tables in the database have a Cardinality of 0 in their indexes.
When I executed ANALYZE TABLE judgeing, the following SQL statement was executed very fast.
SELECT DISTINCT
j0_.judgingid AS judgingid_0,
j0_.starttime AS starttime_1,
j0_.endtime AS endtime_2,
j0_.result AS result_3,
j0_.verified AS verified_4,
j0_.jury_member AS jury_member_5,
j0_.verify_comment AS verify_comment_6,
j0_.valid AS valid_7,
j0_.output_compile AS output_compile_8,
j0_.metadata AS metadata_9,
j0_.seen AS seen_10,
j0_.judge_completely AS judge_completely_11,
j0_.uuid AS uuid_12,
j0_.cid AS cid_13,
j0_.submitid AS submitid_14,
j0_.rejudgingid AS rejudgingid_15,
j0_.prevjudgingid AS prevjudgingid_16,
j0_.errorid AS errorid_17
FROM judging j0_
LEFT JOIN rejudging r1_ ON j0_.rejudgingid = r1_.rejudgingid
INNER JOIN judging_run j2_ ON j0_.judgingid = j2_.judgingid
INNER JOIN judgetask j3_ ON j2_.judgetaskid = j3_.judgetaskid
INNER JOIN judgehost j4_ ON j3_.judgehostid = j4_.judgehostid
WHERE j4_.hostname = 'judgedaemon-802-2'
AND j0_.judgingid = j3_.jobid
AND j2_.runresult IS NULL
AND (j0_.valid = 1 OR r1_.valid = 1)
AND j0_.result <> 'compiler-error';
However, some tables with a large number of rows of data have a Cardinality value.
As I am not very familiar with MariaDB, I would like to ask what would cause indexes to not be built for some tables, i.e. Cardinality = 0.
Is it because there are so few rows in the table that the value of Cardinality is not generated, but some SQL statements INNER JOIN or LEFT JOIN some tables with more rows such as judging or judgetask that make the query slow?
@Dup4 could you somehow give us a dump of your database? Doesn't have to be public if you don't want (on Slack DM or something)
@Dup4 could you somehow give us a dump of your database? Doesn't have to be public if you don't want (on Slack DM or something)
This problem was solved when I executed ANALYZE TABLE ${table_name}, does this database still make sense to reproduce?
If so, I can provide the dump of my database.
@Dup4 could you somehow give us a dump of your database? Doesn't have to be public if you don't want (on Slack DM or something)
This problem was solved when I executed ANALYZE TABLE
${table_name}, does this database still make sense to reproduce?If so, I can provide the dump of my database.
Yes, if/when we fix this we want to be sure that applying the fix to your database does indeed fix your issue. Also as other people might have that issue and that way we make sure that we actually fix their database with a proper migration.
Closing due to inactivity. Feel free to reopen if there is more info.