Scrape queries are logged to slow log - add long_query_time start parameter
In our setup we are running several databases with quite fast queries and mostly very less slow queries from the application. Thats why we setted the long_query_time for all (150) database (clusters) to 0.01 which is a fine value for our usual work load.
Unfortunately the mysqld_exporter runs queries which takes longer then 0.01 second, sometime more then 1 seconds and so we have slow logs logged for those scrape queries, sometimes the exporter queries are the only slows in the log. This is really annoying since this gives us a not really good monitoring of slow queries where we only would like to see statistics about slow queries coming from application users but not from metrics scraping.
There is the flag exporter.log_slow_filter, but this one only filters out the "non tmp disk table" queries. All queries writing tmp disk tables are still logged to the slow log.
And this tmp disk table queries are more then the non tmp disk table queries, so the existing flag has some effect, but not reaches the goal to avoid half of the queries to be logged.
Long story short, a dynamic setting for long_query_time as a exporter parameter would solve this problem. When it would be possible to set the long_query_time up to 10 seconds for the exporter sessions, this problem should be fixed.
Can you please add such a parameter to define a long_query_time value, which is then used for the mysql session the exporter uses?
The would be great!
Some code examples: Here you see a half day slow query log ... there are only scrape queries in (at least the top 4):
# Overall: 11.12k total, 41 unique, 0.28 QPS, 0.01x concurrency __________
# Time range: 2023-10-12T00:00:13 to 2023-10-12T11:12:02
# Attribute total min max avg 95% stddev median
# ============ ======= ======= ======= ======= ======= ======= =======
# Exec time 356s 10ms 190ms 32ms 68ms 20ms 23ms
# Lock time 35s 0 95ms 3ms 21ms 8ms 98us
# Rows sent 1.70M 0 648 160.09 381.65 167.65 97.36
# Rows examine 12.41M 0 3.11k 1.14k 3.04k 1.28k 381.65
# Rows affecte 0 0 0 0 0 0 0
# Bytes sent 98.64M 11 53.75k 9.08k 19.40k 9.15k 6.96k
# Merge passes 0 0 0 0 0 0 0
# Tmp tables 1.63M 0 384 153.39 381.65 158.39 107.34
# Tmp disk tbl 322.99k 0 73 29.74 72.65 30.03 23.65
# Tmp tbl size 2.59G 0 3.21M 243.92k 590.13k 341.98k 106.99k
# Query size 3.88M 16 1.04k 365.80 537.02 195.23 487.09
# InnoDB:
# IO r bytes 0 0 0 0 0 0 0
# IO r ops 0 0 0 0 0 0 0
# IO r wait 0 0 0 0 0 0 0
# pages distin 133.81k 3 40 19.12 34.95 14.99 5.75
# queue wait 0 0 0 0 0 0 0
# rec lock wai 0 0 0 0 0 0 0
# Boolean:
# Filesort 4% yes, 95% no
# Full join 34% yes, 65% no
# Full scan 99% yes, 0% no
# Tmp table 67% yes, 32% no
# Tmp table on 64% yes, 35% no
# Profile
# Rank Query ID Response time Calls R/Call V/M
# ==== =================================== ============== ===== ====== ===
# 1 0xE7F53B26FE0D399A2DBAFBAD1F732B72 196.0669 55.0% 3359 0.0584 0.00 SELECT information_schema.columns
# 2 0x3A86802FFCCBF26D88901A51719CB50B 75.8887 21.3% 3154 0.0241 0.00 SELECT information_schema.tables
# 3 0x6E905E29D0B3D1EA1A2193E75E153007 64.6393 18.1% 3281 0.0197 0.00 SELECT performance_schema.events_waits_summary_global_by_event_name
# 4 0xC25C941D3A5CD2EC72775047662073FB 5.8551 1.6% 429 0.0136 0.00 SELECT information_schema.SCHEMATA information_schema.TABLES
# MISC 0xMISC 13.9295 3.9% 897 0.0155 0.0 <37 ITEMS>
The 1st, 2nd and 4th are tmp disk table queries, the 3rd can be avoided when setting exporter.log_slow_filter
# Query 1: 0.08 QPS, 0.00x concurrency, ID 0xE7F53B26FE0D399A2DBAFBAD1F732B72 at byte 147682
# This item is included in the report because it matches --limit.
# Scores: V/M = 0.00
# Time range: 2023-10-12T00:00:13 to 2023-10-12T11:32:43
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 30 3463
# Exec time 55 202s 35ms 190ms 58ms 82ms 14ms 56ms
# Lock time 70 25s 132us 95ms 7ms 31ms 12ms 287us
# Rows sent 0 13.53k 4 4 4 4 0 4
# Rows examine 82 10.51M 3.11k 3.11k 3.11k 3.11k 0 3.11k
# Rows affecte 0 0 0 0 0 0 0 0
# Bytes sent 2 2.08M 631 631 631 631 0 631
# Merge passes 0 0 0 0 0 0 0 0
# Tmp tables 75 1.27M 384 384 384 384 0 384
# Tmp disk tbl 74 246.87k 73 73 73 73 0 73
# Tmp tbl size 73 1.96G 592.00k 592.00k 592.00k 592.00k 0 592.00k
# Query size 44 1.79M 542 542 542 542 0 542
# InnoDB:
# IO r bytes 0 0 0 0 0 0 0 0
# IO r ops 0 0 0 0 0 0 0 0
# IO r wait 0 0 0 0 0 0 0 0
# pages distin 86 118.62k 34 40 35.08 34.95 0.15 34.95
# queue wait 0 0 0 0 0 0 0 0
# rec lock wai 0 0 0 0 0 0 0 0
# Boolean:
# Full join 100% yes, 0% no
# Full scan 100% yes, 0% no
# Tmp table 100% yes, 0% no
# Tmp table on 100% yes, 0% no
# String:
# Databases information_schema
# Hosts localhost
# InnoDB trxID 0
# Users exporter
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms ################################################################
# 100ms #
# 1s
# 10s+
# Tables
# SHOW TABLE STATUS FROM `information_schema` LIKE 'columns'\G
# SHOW CREATE TABLE `information_schema`.`columns`\G
# EXPLAIN /*!50100 PARTITIONS*/
SELECT t.table_schema, t.table_name, column_name, auto_increment,
pow(2, case data_type
when 'tinyint' then 7
when 'smallint' then 15
when 'mediumint' then 23
when 'int' then 31
when 'bigint' then 63
end+(column_type like '% unsigned'))-1 as max_int
FROM information_schema.columns c
STRAIGHT_JOIN information_schema.tables t
ON BINARY t.table_schema = c.table_schema AND BINARY t.table_name = c.table_name
WHERE c.extra = 'auto_increment' AND t.auto_increment IS NOT NULL\G
# Query 2: 0.08 QPS, 0.00x concurrency, ID 0x3A86802FFCCBF26D88901A51719CB50B at byte 145890
# This item is included in the report because it matches --limit.
# Scores: V/M = 0.00
# Time range: 2023-10-12T00:00:13 to 2023-10-12T11:32:43
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 28 3248
# Exec time 21 78s 11ms 145ms 24ms 40ms 9ms 22ms
# Lock time 23 8s 55us 49ms 3ms 13ms 5ms 93us
# Rows sent 18 320.02k 14 101 100.89 97.36 2.93 97.36
# Rows examine 2 320.02k 14 101 100.89 97.36 2.93 97.36
# Rows affecte 0 0 0 0 0 0 0 0
# Bytes sent 22 22.07M 1.86k 6.96k 6.96k 6.96k 186.24 6.96k
# Merge passes 0 0 0 0 0 0 0 0
# Tmp tables 20 354.82k 2 112 111.86 107.34 3.70 107.34
# Tmp disk tbl 22 76.04k 1 24 23.97 23.65 0.79 23.65
# Tmp tbl size 13 354.81M 0 112.00k 111.86k 106.99k 3.75k 106.99k
# Query size 38 1.53M 494 502 494.01 487.09 0 487.09
# InnoDB:
# IO r bytes 0 0 0 0 0 0 0 0
# IO r ops 0 0 0 0 0 0 0 0
# IO r wait 0 0 0 0 0 0 0 0
# pages distin 12 16.81k 5 8 5.30 5.75 0.47 4.96
# queue wait 0 0 0 0 0 0 0 0
# rec lock wai 0 0 0 0 0 0 0 0
# Boolean:
# Full scan 100% yes, 0% no
# Tmp table 100% yes, 0% no
# Tmp table on 99% yes, 0% no
# String:
# Databases information_schema
# Hosts localhost
# InnoDB trxID 0
# Users exporter
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms ################################################################
# 100ms #
# 1s
# 10s+
# Tables
# SHOW TABLE STATUS FROM `information_schema` LIKE 'tables'\G
# SHOW CREATE TABLE `information_schema`.`tables`\G
# EXPLAIN /*!50100 PARTITIONS*/
SELECT
TABLE_SCHEMA,
TABLE_NAME,
TABLE_TYPE,
ifnull(ENGINE, 'NONE') as ENGINE,
ifnull(VERSION, '0') as VERSION,
ifnull(ROW_FORMAT, 'NONE') as ROW_FORMAT,
ifnull(TABLE_ROWS, '0') as TABLE_ROWS,
ifnull(DATA_LENGTH, '0') as DATA_LENGTH,
ifnull(INDEX_LENGTH, '0') as INDEX_LENGTH,
ifnull(DATA_FREE, '0') as DATA_FREE,
ifnull(CREATE_OPTIONS, 'NONE') as CREATE_OPTIONS
FROM information_schema.tables
WHERE TABLE_SCHEMA = 'sys'\G
# Query 3: 0.08 QPS, 0.00x concurrency, ID 0x6E905E29D0B3D1EA1A2193E75E153007 at byte 4997524
# This item is included in the report because it matches --limit.
# Scores: V/M = 0.00
# Time range: 2023-10-12T00:00:13 to 2023-10-12T10:56:32
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 28 3281
# Exec time 17 65s 11ms 77ms 20ms 33ms 7ms 17ms
# Lock time 0 283ms 17us 10ms 86us 93us 426us 44us
# Rows sent 72 1.25M 398 398 398 398 0 398
# Rows examine 9 1.25M 398 398 398 398 0 398
# Rows affecte 0 0 0 0 0 0 0 0
# Bytes sent 64 64.57M 20.15k 20.15k 20.15k 20.15k 0 20.15k
# Merge passes 0 0 0 0 0 0 0 0
# Tmp tables 0 0 0 0 0 0 0 0
# Tmp disk tbl 0 0 0 0 0 0 0 0
# Tmp tbl size 0 0 0 0 0 0 0 0
# Query size 8 365.27k 114 114 114 114 0 114
# Boolean:
# Full scan 100% yes, 0% no
# String:
# Databases information_schema
# Hosts localhost
# Users exporter
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms ################################################################
# 100ms
# 1s
# 10s+
# Tables
# SHOW TABLE STATUS FROM `performance_schema` LIKE 'events_waits_summary_global_by_event_name'\G
# SHOW CREATE TABLE `performance_schema`.`events_waits_summary_global_by_event_name`\G
# EXPLAIN /*!50100 PARTITIONS*/
SELECT EVENT_NAME, COUNT_STAR, SUM_TIMER_WAIT
FROM performance_schema.events_waits_summary_global_by_event_name\G
# Query 4: 0.01 QPS, 0.00x concurrency, ID 0xC25C941D3A5CD2EC72775047662073FB at byte 2972268
# This item is included in the report because it matches --limit.
# Scores: V/M = 0.00
# Time range: 2023-10-12T00:02:02 to 2023-10-12T11:32:39
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 3 441
# Exec time 1 6s 10ms 32ms 14ms 20ms 4ms 12ms
# Lock time 0 79ms 65us 13ms 178us 224us 691us 103us
# Rows sent 0 882 2 2 2 2 0 2
# Rows examine 1 149.01k 346 346 346 346 0 346
# Rows affecte 0 0 0 0 0 0 0 0
# Bytes sent 0 62.45k 145 145 145 145 0 145
# Merge passes 0 0 0 0 0 0 0 0
# Tmp tables 2 35.31k 82 82 82 82 0 82
# Tmp disk tbl 1 4.74k 11 11 11 11 0 11
# Tmp tbl size 4 129.21M 300.03k 300.03k 300.03k 300.03k 0 300.03k
# Query size 3 138.67k 322 322 322 322 0 322
# InnoDB:
# IO r bytes 0 0 0 0 0 0 0 0
# IO r ops 0 0 0 0 0 0 0 0
# IO r wait 0 0 0 0 0 0 0 0
# pages distin 1 1.40k 3 6 3.25 3.89 0.55 2.90
# queue wait 0 0 0 0 0 0 0 0
# rec lock wai 0 0 0 0 0 0 0 0
# Boolean:
# Filesort 100% yes, 0% no
# Full join 100% yes, 0% no
# Full scan 100% yes, 0% no
# Tmp table 100% yes, 0% no
# Tmp table on 100% yes, 0% no
# String:
# Databases information_schema
# Hosts localhost
# InnoDB trxID 0
# Users exporter
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms ################################################################
# 100ms
# 1s
# 10s+
# Tables
# SHOW TABLE STATUS FROM `information_schema` LIKE 'SCHEMATA'\G
# SHOW CREATE TABLE `information_schema`.`SCHEMATA`\G
# SHOW TABLE STATUS FROM `information_schema` LIKE 'TABLES'\G
# SHOW CREATE TABLE `information_schema`.`TABLES`\G
# EXPLAIN /*!50100 PARTITIONS*/
SELECT S.SCHEMA_NAME AS 'schema', COUNT(T.TABLE_NAME) AS tables_count FROM information_schema.SCHEMATA AS S LEFT JOIN information_schema.TABLES AS T ON S.SCHEMA_NAME = T.TABLE_SCHEMA AND T.TABLE_TYPE = 'BASE TABLE' WHERE S.SCHEMA_NAME NOT IN ('information_schema','mysql','performance_schema','sys') GROUP BY 1 WITH ROLLUP\G
Sure, adding long_query_time to the session params is a great idea. It should be simple enough to do. Would you be up to adding this feature? You can see the params and related variables here: https://github.com/prometheus/mysqld_exporter/blob/main/collector/exporter.go#L45