workload: fix rand workload

Open renatolabs opened this issue 2 years ago • 2 comments

The rand workload generates random table definitions when initialized, and performs random writes to the first table created when run. It relies on the randgen package to generate random data and has internal logic to convert each Datum to the corresponding Go data structure to be passed to a database call.

This means that whenever randgen is able to generate new types of data, the rand workload needs to be changed accordingly. Since there were no automated means to detect when the rand workload had drifted (and this workload is probably not run very frequently), running rand could fail because it wouldn't know how to generate a Go data structure for a specific datum.

This commit updates the rand workload to add support for the missing data types and also adds a unit test that verifies that it is able to generate and insert data for all possible column types. If a new type is added to randgen.RandomDatum, this test should fail.
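
As a rough illustration of how such a test can catch drift, here is a minimal sketch. It does not use the real randgen or workload APIs: `Kind`, `Datum`, `toGoValue`, `sample`, and `unsupportedKinds` are hypothetical stand-ins. The idea is that the conversion rejects unknown kinds explicitly, and a check walks every known kind so that a newly added one surfaces as a failure.

```go
package main

import (
	"fmt"
	"strconv"
)

// Kind is a hypothetical stand-in for a column type family.
// It is not the real CockroachDB type representation.
type Kind int

const (
	KindInt Kind = iota
	KindString
	KindBool
	KindFloat
	numKinds // sentinel: must stay last so loops cover every kind
)

// Datum is a hypothetical stand-in for a randomly generated value.
type Datum struct {
	Kind Kind
	Text string
}

// toGoValue converts a Datum into a Go value usable as a query argument.
// Unknown kinds produce an error instead of being silently accepted.
func toGoValue(d Datum) (interface{}, error) {
	switch d.Kind {
	case KindInt:
		return strconv.ParseInt(d.Text, 10, 64)
	case KindString:
		return d.Text, nil
	case KindBool:
		return strconv.ParseBool(d.Text)
	case KindFloat:
		return strconv.ParseFloat(d.Text, 64)
	default:
		return nil, fmt.Errorf("unsupported kind %d", d.Kind)
	}
}

// sample returns one fixed example per kind; a real test would use random data.
func sample(k Kind) Datum {
	texts := map[Kind]string{
		KindInt:    "42",
		KindString: "x",
		KindBool:   "true",
		KindFloat:  "1.5",
	}
	return Datum{Kind: k, Text: texts[k]}
}

// unsupportedKinds lists every kind that toGoValue fails to convert.
func unsupportedKinds() []Kind {
	var missing []Kind
	for k := Kind(0); k < numKinds; k++ {
		if _, err := toGoValue(sample(k)); err != nil {
			missing = append(missing, k)
		}
	}
	return missing
}

func main() {
	fmt.Println("unsupported kinds:", unsupportedKinds())
}
```

If a new kind is added before `numKinds` without a matching `case`, `unsupportedKinds` reports it, which is the behavior the unit test described above is meant to provide.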

This also changes the random seed used in rand in each run.

renatolabs avatar Sep 21 '22 16:09 renatolabs

This change is Reviewable

cockroach-teamcity avatar Sep 21 '22 16:09 cockroach-teamcity

Note that this does not guarantee that no errors are seen when running the rand workload. One issue I observed in my tests is that random data generation is not constrained by the schema definition. In other words, the following is possible:

  • table is created with column c1 as INT8, and column c2 is defined as c1 + 200
  • randgen generates MaxInt64-1 as a value for c1
  • error: c2 overflows

Fixing this type of "bug" is out of scope for this PR as it's not related to the workload itself.

renatolabs avatar Sep 21 '22 16:09 renatolabs

The OIDs don't imply the widths. I guess you can make a char(2) type or whatever width you want but if you don't name the width I guess we do 1? I think I just wanted to make sure randgen always made a valid value and 1 will do that for all char widths.

I've learned now that you can access column widths via the information_schema.columns table's character_maximum_length column. That has to be done on a per column basis.
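
A rough sketch of that per-column lookup using `database/sql` follows. `columnWidth` and `clampToWidth` are hypothetical helpers, and the query filters only on table and column name for brevity; a real lookup would probably also need to filter on schema. Connecting requires a registered driver, which is omitted here.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
)

// widthQuery reads the declared character width of a single column.
const widthQuery = `
SELECT character_maximum_length
  FROM information_schema.columns
 WHERE table_name = $1 AND column_name = $2`

// columnWidth returns the declared width of a column. ok is false when the
// column has no declared width (character_maximum_length is NULL).
func columnWidth(
	ctx context.Context, db *sql.DB, table, column string,
) (width int, ok bool, err error) {
	var n sql.NullInt64
	if err := db.QueryRowContext(ctx, widthQuery, table, column).Scan(&n); err != nil {
		return 0, false, err
	}
	return int(n.Int64), n.Valid, nil
}

// clampToWidth truncates s to at most width characters (runes), so that a
// generated string fits the column it is written to.
func clampToWidth(s string, width int) string {
	r := []rune(s)
	if width >= 0 && len(r) > width {
		return string(r[:width])
	}
	return s
}

func main() {
	fmt.Println(clampToWidth("hello", 2))
}
```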

ajwerner avatar Sep 23 '22 15:09 ajwerner

bors r=smg260

TFTR!

renatolabs avatar Sep 28 '22 20:09 renatolabs

Build succeeded:

craig[bot] avatar Sep 28 '22 21:09 craig[bot]