
sql: add SHOW STATISTICS WITH FORECAST

Open michae2 opened this issue 3 years ago • 6 comments

sql/stats: replace eval.Context with tree.CompareContext

Most uses of eval.Context in the sql/stats package can actually be tree.CompareContext instead, so make the replacement.

Release note: None

sql/stats: bump histogram version to 2

In 22.2 as of 963deb8 we support multiple histograms for trigram-indexed strings. Let's bump the histogram version for this change, as we may want to know whether multiple histograms are possible for a given row in system.table_statistics.

(I suspect that during upgrades to 22.2 the 22.1 statistics builder will choke on these statistics, so maybe we should also backport a version check to 22.1.)

Also update avgRefreshTime to work correctly in multiple-histogram cases.

Release note: None

sql/stats: teach histogram.adjustCounts to remove empty buckets

Sometimes when adjusting counts down we end up with empty buckets in the histogram. They don't hurt anything, but they take up some memory (and some brainpower when examining test results). So, teach adjustCounts to remove them.
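A minimal sketch of the idea, not the actual adjustCounts code: the `bucket` type and `removeEmptyBuckets` helper below are hypothetical simplifications, and the sketch assumes a bucket counts as empty when both of its counts are zero. It ignores any subtleties the real implementation may handle, such as whether a boundary bucket must be retained.

```go
package main

import "fmt"

// bucket is a hypothetical, simplified stand-in for a histogram bucket.
type bucket struct {
	NumEq      float64 // rows equal to UpperBound
	NumRange   float64 // rows between the previous upper bound and this one
	UpperBound int
}

// removeEmptyBuckets filters out buckets whose counts are both zero.
// It reuses the backing array of the input slice, so the input must not
// be used after the call.
func removeEmptyBuckets(buckets []bucket) []bucket {
	if buckets == nil {
		return nil
	}
	kept := buckets[:0]
	for _, b := range buckets {
		if b.NumEq > 0 || b.NumRange > 0 {
			kept = append(kept, b)
		}
	}
	return kept
}

func main() {
	// Assume an earlier downward adjustment left the middle bucket empty.
	buckets := []bucket{
		{NumEq: 5, NumRange: 0, UpperBound: 10},
		{NumEq: 0, NumRange: 0, UpperBound: 20},
		{NumEq: 3, NumRange: 4, UpperBound: 30},
	}
	for _, b := range removeEmptyBuckets(buckets) {
		fmt.Printf("upper=%d eq=%v range=%v\n", b.UpperBound, b.NumEq, b.NumRange)
	}
}
```

Filtering in place avoids an extra allocation, which fits the memory-saving motivation described above.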

Release note: None

sql/stats: always use non-nil buckets for empty-table histograms

After 82b5926 I've been using the convention that nil histogram buckets = no histogram, and non-nil, zero-length histogram buckets = histogram on empty table. This is mostly useful for testing but is also important for forecasting histograms.
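The convention relies on Go distinguishing a nil slice from an empty, non-nil one. A small illustration, where the `histogram` type and `describe` function are hypothetical names used only for this sketch:

```go
package main

import "fmt"

// bucket and histogram are hypothetical, simplified types for illustration.
type bucket struct {
	NumEq, NumRange float64
}

type histogram struct {
	buckets []bucket
}

// describe classifies a histogram according to the nil / non-nil convention.
func describe(h histogram) string {
	switch {
	case h.buckets == nil:
		return "no histogram"
	case len(h.buckets) == 0:
		return "histogram on empty table"
	default:
		return "histogram with data"
	}
}

func main() {
	fmt.Println(describe(histogram{}))                             // nil buckets
	fmt.Println(describe(histogram{buckets: []bucket{}}))          // non-nil, zero length
	fmt.Println(describe(histogram{buckets: []bucket{{NumEq: 1}}})) // populated
}
```

Note that `len` is 0 in both of the first two cases, so code that only checks the length cannot tell them apart; an explicit comparison with nil is needed.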

Fix a spot that wasn't following this convention.

Also, add some empty-table test cases and some other test cases for histogram.adjustCounts.

Release note: None

sql/stats: forecast table statistics

Add a function to forecast table statistics based on observed statistics. These forecasts are based on linear regression models over time. For each set of columns with statistics, we construct a linear regression model over time for each statistic (row count, null count, distinct count, average row size, and histogram). If all models are good fits, then we produce a statistics forecast for the set of columns.
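To make the approach concrete, here is a rough sketch for a single scalar statistic such as row count. It is not the code in this PR: `fitLine`, `forecast`, the minimum of three observations, the use of R² as the goodness-of-fit measure, and the `minR2` threshold are all assumptions made for illustration. Histogram forecasting is not covered.

```go
package main

import "fmt"

// fitLine fits y = intercept + slope*x by ordinary least squares and
// returns the coefficient of determination (R²) as a goodness-of-fit measure.
func fitLine(xs, ys []float64) (slope, intercept, r2 float64) {
	n := float64(len(xs))
	var sumX, sumY float64
	for i := range xs {
		sumX += xs[i]
		sumY += ys[i]
	}
	meanX, meanY := sumX/n, sumY/n
	var sxx, sxy, syy float64
	for i := range xs {
		dx, dy := xs[i]-meanX, ys[i]-meanY
		sxx += dx * dx
		sxy += dx * dy
		syy += dy * dy
	}
	if sxx == 0 {
		// All observations share one timestamp: no trend can be fit.
		return 0, meanY, 0
	}
	slope = sxy / sxx
	intercept = meanY - slope*meanX
	if syy == 0 {
		// A constant series is fit exactly by a flat line.
		return slope, intercept, 1
	}
	return slope, intercept, (sxy * sxy) / (sxx * syy)
}

// forecast predicts the statistic at time `at`. It reports false when there
// are too few observations or the fit falls below minR2 (a hypothetical
// threshold chosen by the caller).
func forecast(times, values []float64, at, minR2 float64) (float64, bool) {
	if len(times) < 3 || len(times) != len(values) {
		return 0, false
	}
	slope, intercept, r2 := fitLine(times, values)
	if r2 < minR2 {
		return 0, false
	}
	pred := intercept + slope*at
	if pred < 0 {
		pred = 0 // counts cannot be negative
	}
	return pred, true
}

func main() {
	times := []float64{0, 1, 2, 3} // e.g. hours since the first collection
	rowCounts := []float64{100, 200, 300, 400}
	if v, ok := forecast(times, rowCounts, 4, 0.95); ok {
		fmt.Printf("forecast row count: %.0f\n", v)
	}
	noisy := []float64{100, 400, 50, 300}
	if _, ok := forecast(times, noisy, 4, 0.95); !ok {
		fmt.Println("noisy series: no forecast")
	}
}
```

Under the rule described above, a caller would run a check like this for every statistic of a column set and emit a forecast only if each one passes.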

Assists: #79872

Release note: None

sql: add SHOW STATISTICS WITH FORECAST

Add a new WITH FORECAST option to SHOW STATISTICS which calculates and displays forecasted statistics along with the existing table statistics.

Also, forbid injecting forecasted stats.

Assists: #79872

Release note (sql change): Add a new WITH FORECAST option to SHOW STATISTICS which calculates and displays forecasted statistics along with the existing table statistics.

michae2 avatar Feb 26 '22 01:02 michae2

This change is Reviewable

cockroach-teamcity avatar Feb 26 '22 01:02 cockroach-teamcity

This is now compiling and not crashing on some example statistics, so I wanted to open a draft PR as a sneak preview. It still needs tests before it is ready for review.

PRs after this will wire the forecasts into the stats_cache.

michae2 avatar Feb 26 '22 01:02 michae2

Added some tests, fixed some bugs. A few tests still need to be written.

michae2 avatar Mar 04 '22 01:03 michae2

This is RFAL. I need to write two more tests, but that can happen concurrently with reviewing.

michae2 avatar Mar 08 '22 06:03 michae2

Make some linters happy. 🙂

michae2 avatar Mar 08 '22 18:03 michae2

Rebased on the previous patches, but haven't yet addressed feedback. Hold off on reviewing just yet.

michae2 avatar Jul 31 '22 15:07 michae2

TFYRs!!

bors r+

michae2 avatar Aug 12 '22 23:08 michae2

:-1: Rejected by code reviews

craig[bot] avatar Aug 12 '22 23:08 craig[bot]

bors r+

michae2 avatar Aug 12 '22 23:08 michae2

Build failed (retrying...):

craig[bot] avatar Aug 13 '22 01:08 craig[bot]

Build succeeded:

craig[bot] avatar Aug 13 '22 04:08 craig[bot]