screwdriver
Cannot open the list view for a specific pipeline
What happened: The list view cannot be opened for a specific pipeline. When this happens, one of the following messages is displayed:
- The status code is 500
- net::ERR_FAILED
What you expected to happen:
- The list view is displayed correctly.
How to reproduce it:
- Open the list view of a specific pipeline
- Maybe it only happens for a pipeline with a large number of jobs?
cc @adong Maybe we need to batch the requests?
Also, we should have an API that takes just the pipeline ID.
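To make the batching suggestion concrete, here is a minimal sketch of splitting a pipeline's job IDs into fixed-size batches so that each request covers a bounded number of jobs. The helper name `chunked` and the batch size of 500 are assumptions for illustration, not existing Screwdriver code; only the figure of 3900 jobs comes from this issue.

```python
from typing import Iterator, List, Sequence


def chunked(job_ids: Sequence[int], batch_size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of job_ids, each at most batch_size long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(job_ids), batch_size):
        yield list(job_ids[start:start + batch_size])


# 3900 jobs (the pipeline size reported in this issue), hypothetical batch size of 500
batches = list(chunked(range(1, 3901), 500))
print(len(batches), len(batches[-1]))
```

Each batch could then be sent as one status request instead of asking for all jobs at once.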
Additional information:
Slow queries were observed. The build status query appears to take about 75 seconds.
Pipeline information: the pipeline has 3900 jobs.
The number of builds for some of those jobs is shown below (the job IDs are masked).
mysql> select count(id) from builds where jobid=xxxxxxx;
+-----------+
| count(id) |
+-----------+
| 358 |
+-----------+
1 row in set (0.02 sec)
mysql> select count(id) from builds where jobid=xxxxxxx;
+-----------+
| count(id) |
+-----------+
| 627 |
+-----------+
1 row in set (0.02 sec)
mysql> select count(id) from builds where jobid=xxxxxx;
+-----------+
| count(id) |
+-----------+
| 1701 |
+-----------+
1 row in set (0.02 sec)
mysql> select count(id) from builds where jobid=xxxxxx;
+-----------+
| count(id) |
+-----------+
| 0 |
+-----------+
1 row in set (0.02 sec)
mysql> select count(id) from builds where jobid=xxxxxx;
+-----------+
| count(id) |
+-----------+
| 2312 |
+-----------+
1 row in set (0.02 sec)
mysql> select count(id) from builds where jobid=xxxxxx;
+-----------+
| count(id) |
+-----------+
| 2653 |
+-----------+
1 row in set (0.03 sec)
mysql> select count(id) from builds where jobid=xxxxxx;
+-----------+
| count(id) |
+-----------+
| 3 |
+-----------+
1 row in set (0.02 sec)
mysql> select count(id) from builds where jobid=xxxxxxx;
+-----------+
| count(id) |
+-----------+
| 1151 |
+-----------+
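The counts above were collected with one query per job. As a rough sketch, the same numbers could be gathered in a single grouped query. The example below uses an in-memory SQLite database as a stand-in for MySQL, with only the `builds` table and the `id` and `jobid` columns taken from the queries above; the function name `count_builds_by_job`, the index, and the sample rows are assumptions for illustration.

```python
import sqlite3


def count_builds_by_job(conn, job_ids):
    """Return {job_id: build_count} for the given jobs using one grouped query."""
    job_ids = list(job_ids)
    if not job_ids:
        return {}
    placeholders = ",".join("?" for _ in job_ids)
    rows = conn.execute(
        f"SELECT jobid, COUNT(id) FROM builds "
        f"WHERE jobid IN ({placeholders}) GROUP BY jobid",
        job_ids,
    ).fetchall()
    counts = {job_id: 0 for job_id in job_ids}  # jobs without builds stay at 0
    counts.update(dict(rows))
    return counts


# In-memory stand-in for the real database, with made-up sample rows
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE builds (id INTEGER PRIMARY KEY, jobid INTEGER)")
conn.execute("CREATE INDEX builds_jobid ON builds (jobid)")
conn.executemany("INSERT INTO builds (jobid) VALUES (?)", [(1,)] * 3 + [(2,)] * 5)

result = count_builds_by_job(conn, [1, 2, 3])
print(result)
```

Jobs with no builds do not appear in the grouped result, so the helper fills them in with 0 explicitly.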
@jithine @kumada626
The cause of this problem is that the raw query takes a long time to execute.
In my production environment, it took more than 1 minute to execute.
Here are some ideas for solving this problem. Tell me what you think.
- Improve the performance of the raw query.
- Change build status acquisition from synchronous to asynchronous. Instead of getting the status of every build in a single request, fetch it separately, in the same way as the coverage. Note: this increases the number of API calls, so the number of concurrent requests needs to be controlled.
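For the second idea, a minimal sketch of fetching statuses asynchronously while capping the number of requests in flight could look like the following. `fetch_status` is a hypothetical stand-in for one per-job API call, and the limit of 10 concurrent requests is an arbitrary assumption; neither is taken from the Screwdriver code base.

```python
import asyncio

in_flight = 0  # requests currently running
peak = 0       # highest number of simultaneous requests observed


async def fetch_status(job_id: int) -> str:
    """Hypothetical stand-in for one per-job status request."""
    global in_flight, peak
    in_flight += 1
    peak = max(peak, in_flight)
    await asyncio.sleep(0.01)  # simulates network latency
    in_flight -= 1
    return "SUCCESS"


async def fetch_all_statuses(job_ids, max_in_flight=10):
    """Fetch every status concurrently, never exceeding max_in_flight requests."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def limited(job_id):
        async with semaphore:
            return job_id, await fetch_status(job_id)

    pairs = await asyncio.gather(*(limited(j) for j in job_ids))
    return dict(pairs)


statuses = asyncio.run(fetch_all_statuses(range(1, 51), max_in_flight=10))
print(len(statuses), peak)
```

The semaphore is what keeps the extra API traffic under control: all 50 requests are scheduled at once, but only `max_in_flight` of them run at any moment.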
We found that this issue still has a performance problem.
When the number of builds for a specific job exceeds 8000, the /builds/statuses API times out. However, the MySQL queries keep running, so the load stays high (about 90% of the 8 vCPUs on our MySQL server) for 15 minutes.
@jithine Do you have a similar issue with PostgreSQL?
We have a 30-day data retention policy. Because of that cleanup, I don't think we are seeing this issue.
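For anyone who wants to try a similar cleanup, here is a rough sketch of a 30-day purge. It is not our actual cleanup job: it runs against an in-memory SQLite database, and the `createTime` column (stored here as epoch seconds), the function name `purge_old_builds`, and the sample rows are all assumptions for illustration.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def purge_old_builds(conn, now, retention_days=30):
    """Delete builds created before the retention cutoff; return how many were removed."""
    cutoff = int((now - timedelta(days=retention_days)).timestamp())
    cur = conn.execute("DELETE FROM builds WHERE createTime < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


# In-memory stand-in; createTime is a hypothetical column holding epoch seconds
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE builds (id INTEGER PRIMARY KEY, jobid INTEGER, createTime INTEGER)"
)
now = datetime(2024, 3, 1, tzinfo=timezone.utc)
ages_in_days = (1, 29, 31, 90)
conn.executemany(
    "INSERT INTO builds (jobid, createTime) VALUES (?, ?)",
    [(1, int((now - timedelta(days=d)).timestamp())) for d in ages_in_days],
)

deleted = purge_old_builds(conn, now)
remaining = conn.execute("SELECT COUNT(id) FROM builds").fetchone()[0]
print(deleted, remaining)
```

On a large table, the delete itself can be expensive, so a real job would probably need an index on the timestamp column and deletion in smaller batches.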