cockroach icon indicating copy to clipboard operation
cockroach copied to clipboard

*: rename SQL Stats

Open Azhng opened this issue 2 years ago • 4 comments

Previously, SQL Stats refer to statement / transaction statistics. This is often confused with table statistics, which are used by the optimizer to better plan queries.

We need a better name to avoid terminology confusion.

Jira issue: CRDB-15788

Azhng avatar Apr 20 '22 21:04 Azhng

Can we make a thorough list of all the places that the statement / transcation statistics are named with some form of stats or statistics? A rough brainstorm:

  • The job_type for stats compactions is AUTO SQL STATS COMPACTION
  • session and cluster settings?
  • packages

I think we should prioritize user-facing names like settings and job types/descriptions. Package names aren't see by a customer and will cause less confusion. They're also easier to switch in the future, whereas things like setting names are not.

mgartner avatar Apr 20 '22 22:04 mgartner

There's also the crdb_internal.reset_sql_stats builtin

rafiss avatar Apr 25 '22 19:04 rafiss

The cluster settings for SQL activity stats also use the same namespace, sql.stats, as table statistics cluster settings:

Table statistics:

Setting Type Default Description
sql.stats.automatic_collection.enabled boolean true automatic statistics collection mode
sql.stats.automatic_collection.fraction_stale_rows float 0.2 target fraction of stale rows per table that will trigger a statistics refresh
sql.stats.automatic_collection.min_stale_rows integer 500 target minimum number of stale rows per table that will trigger a statistics refresh
sql.stats.histogram_collection.enabled boolean true histogram collection mode
sql.stats.multi_column_collection.enabled boolean true multi-column statistics collection mode
sql.stats.post_events.enabled boolean false if set, an event is logged for every CREATE STATISTICS job

SQL Activity Statistics:

Setting Type Default Description
sql.stats.cleanup.recurrence string @hourly cron-tab recurrence for SQL Stats cleanup job
sql.stats.flush.enabled boolean true if set, SQL execution statistics are periodically flushed to disk
sql.stats.flush.interval duration 10m0s the interval at which SQL execution statistics are flushed to disk, this value must be less than or equal to sql.stats.aggregation.interval
sql.stats.persisted_rows.max integer 1000000 maximum number of rows of statement and transaction statistics that will be persisted in the system tables
sql.stats.response.max integer 20000 the maximum number of statements and transaction stats returned in a CombinedStatements request

We'll have to figure out a migration plan for the settings that causes as little annoyance for customers as possible.

More immediately, I think we should make the descriptions of these settings more clear in our documentation.

mgartner avatar Apr 27 '22 19:04 mgartner

This also affects metric names: https://github.com/cockroachdb/cockroach/blob/2fdc0a62df08829dd96901fffb5fb1fbb0b0997d/pkg/sql/exec_util.go#L1031-L1084

mgartner avatar Apr 27 '22 19:04 mgartner

No longer a priority

maryliag avatar Nov 03 '23 20:11 maryliag

Reopening because there was confusion from a customer about this.

michae2 avatar Nov 30 '23 17:11 michae2