
sql: optimize copy to bypass optimizer

Open cucaroach opened this issue 2 years ago • 2 comments

COPY FROM currently type-checks its input twice: once when reading the tuples with ParseAndRequireString, and again in the optimizer via the expensive buildValues machinery. Instead of converting the already-typed tuples into opt expressions, leave them as Datums and pass them through the optimizer untouched. This exploits the existing pseudo-table machinery, which uses a RowContainer to pre-populate the valuesNode with Datums. A scratch buffer is also used to populate the RowContainer, rather than allocating new Datums for every row.
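The scratch-buffer idea can be illustrated with a minimal, self-contained Go sketch. It is not the CockroachDB code: `Datum`, `rowContainer`, `parseInto`, and `loadRows` are hypothetical stand-ins invented for this example, and the sketch assumes a container that copies each row it is handed, so a single row-sized slice can be reused for every input row.

```go
package main

import (
	"fmt"
	"strconv"
)

// Datum is a hypothetical stand-in for a typed SQL value.
type Datum interface{}

// rowContainer is a hypothetical stand-in for a container that copies
// every row it is given into its own backing storage.
type rowContainer struct {
	numCols int
	data    []Datum // rows stored back to back
}

// AddRow copies the row, so the caller may reuse the slice afterwards.
func (c *rowContainer) AddRow(row []Datum) {
	c.data = append(c.data, row...)
}

// Len returns the number of rows stored so far.
func (c *rowContainer) Len() int { return len(c.data) / c.numCols }

// Row returns a view of the i-th stored row.
func (c *rowContainer) Row(i int) []Datum {
	return c.data[i*c.numCols : (i+1)*c.numCols]
}

// parseInto converts text fields to typed values exactly once,
// writing the results into the caller-provided scratch slice.
// For simplicity, every column is assumed to be an integer.
func parseInto(fields []string, scratch []Datum) error {
	for i, f := range fields {
		n, err := strconv.Atoi(f)
		if err != nil {
			return err
		}
		scratch[i] = n
	}
	return nil
}

// loadRows fills a container from text input, reusing one scratch
// slice instead of allocating a fresh row slice per input row.
func loadRows(input [][]string, numCols int) (*rowContainer, error) {
	c := &rowContainer{numCols: numCols}
	scratch := make([]Datum, numCols) // allocated once, reused below
	for _, fields := range input {
		if len(fields) != numCols {
			return nil, fmt.Errorf("expected %d columns, got %d", numCols, len(fields))
		}
		if err := parseInto(fields, scratch); err != nil {
			return nil, err
		}
		c.AddRow(scratch) // safe: AddRow copies the values out
	}
	return c, nil
}

func main() {
	c, err := loadRows([][]string{{"1", "2"}, {"3", "4"}}, 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(c.Len(), c.Row(1))
}
```

Reusing the scratch slice is only safe because the container copies on insert; a container that retained the slice it was passed would see every row overwritten by the last one.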

benchstat output:

name         old time/op    new time/op    delta
CopyFrom-32  7.78s ± 3%     6.93s ± 4%     -10.89%  (p=0.000 n=10+9)

name         old mb/s       new mb/s       delta
CopyFrom-32  0.82 ± 1%      0.94 ± 3%      +15.18%  (p=0.000 n=10+10)

name         old rows/s     new rows/s     delta
CopyFrom-32  6.76k ± 1%     7.79k ± 3%     +15.17%  (p=0.000 n=10+10)

name         old alloc/op   new alloc/op   delta
CopyFrom-32  3.29GB ± 2%    3.19GB ± 1%    -3.11%   (p=0.000 n=10+9)

name         old allocs/op  new allocs/op  delta
CopyFrom-32  11.6M ± 0%     8.3M ± 0%      -28.93%  (p=0.000 n=9+8)

Release note (performance improvement): Optimize the execution of COPY FROM.

cucaroach, Jul 05 '22 20:07

This change is Reviewable

cockroach-teamcity, Jul 05 '22 20:07

This is RFAL, adding Yahor to look at execution bits. Thanks!

cucaroach, Aug 09 '22 20:08

bors r+

cucaroach, Aug 16 '22 21:08

Build succeeded:

craig[bot], Aug 16 '22 22:08