Batch rows on access node for distributed COPY

Open akuzm opened this issue 2 years ago • 2 comments

Group the incoming rows into batches on the access node before COPYing them to the data nodes.
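As a rough illustration of the batching idea (not the code in `tsl/src/remote/dist_copy.c`), the sketch below accumulates text rows in a per-data-node buffer and hands the buffer to a callback once it holds a fixed number of rows or runs out of space. Everything here is hypothetical and chosen for the example: `RowBatch`, `batch_add_row`, `batch_flush`, `count_flush`, and the `BATCH_ROWS` and `BATCH_BYTES` limits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BATCH_ROWS 4     /* hypothetical rows per batch, not the PR's value */
#define BATCH_BYTES 1024 /* hypothetical buffer capacity in bytes */

/* Callback that ships one finished batch to a data node. */
typedef void (*flush_fn)(const char *data, size_t len, size_t nrows, void *arg);

typedef struct RowBatch
{
	char data[BATCH_BYTES];
	size_t len;   /* bytes currently buffered */
	size_t nrows; /* rows currently buffered */
	flush_fn flush;
	void *flush_arg;
} RowBatch;

static void
batch_init(RowBatch *b, flush_fn flush, void *arg)
{
	assert(flush != NULL);
	b->len = 0;
	b->nrows = 0;
	b->flush = flush;
	b->flush_arg = arg;
}

/* Send whatever is buffered (if anything) and reset the batch. */
static void
batch_flush(RowBatch *b)
{
	if (b->nrows == 0)
		return;
	b->flush(b->data, b->len, b->nrows, b->flush_arg);
	b->len = 0;
	b->nrows = 0;
}

/* Append one text row (without trailing newline).
 * Returns 0 on success, -1 if the row can never fit in the buffer. */
static int
batch_add_row(RowBatch *b, const char *row)
{
	size_t rowlen = strlen(row);

	if (rowlen + 1 > BATCH_BYTES)
		return -1;
	if (b->len + rowlen + 1 > BATCH_BYTES)
		batch_flush(b); /* make room before appending */

	memcpy(b->data + b->len, row, rowlen);
	b->len += rowlen;
	b->data[b->len++] = '\n';
	b->nrows++;

	if (b->nrows == BATCH_ROWS)
		batch_flush(b); /* batch is full, ship it */
	return 0;
}

/* Example callback that only records what it was asked to send. */
typedef struct FlushStats
{
	size_t batches;
	size_t rows;
	size_t bytes;
} FlushStats;

static void
count_flush(const char *data, size_t len, size_t nrows, void *arg)
{
	FlushStats *s = arg;

	(void) data;
	s->batches++;
	s->rows += nrows;
	s->bytes += len;
}
```

A caller would keep one `RowBatch` per data node, call `batch_add_row` for each incoming row routed to that node, and call `batch_flush` once at the end of input to send the final partial batch.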

Also switch the data node connections to nonblocking mode when sending COPY data, so that the access node can work with many data nodes concurrently.
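To show why nonblocking mode helps here, the following is a minimal sketch using plain POSIX descriptors rather than libpq connections, so it is an analogy and not the PR's implementation. With libpq the corresponding calls would be along the lines of `PQsetnonblocking`, `PQputCopyData`, and `PQflush`. The helper names `set_nonblocking` and `send_to_all` and the `MAX_NODES` limit are assumptions made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_NODES 16 /* hypothetical limit for this sketch */

/* Put a descriptor into nonblocking mode; returns 0 on success, -1 on error. */
static int
set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Send bufs[i] (lens[i] bytes) to fds[i] for every i, writing to whichever
 * descriptor is ready instead of finishing one node before starting the next.
 * Returns 0 once everything is written, -1 on error. */
static int
send_to_all(const int *fds, const char *const *bufs, const size_t *lens, int n)
{
	size_t sent[MAX_NODES] = { 0 };
	struct pollfd pfds[MAX_NODES];
	int pending = 0;

	assert(n >= 0 && n <= MAX_NODES);

	for (int i = 0; i < n; i++)
	{
		/* poll() ignores entries with a negative fd */
		pfds[i].fd = lens[i] > 0 ? fds[i] : -1;
		pfds[i].events = POLLOUT;
		pfds[i].revents = 0;
		if (lens[i] > 0)
			pending++;
	}

	while (pending > 0)
	{
		if (poll(pfds, (nfds_t) n, -1) < 0)
		{
			if (errno == EINTR)
				continue;
			return -1;
		}

		for (int i = 0; i < n; i++)
		{
			ssize_t w;

			if (pfds[i].fd < 0)
				continue;
			if (pfds[i].revents & (POLLERR | POLLHUP | POLLNVAL))
				return -1;
			if (!(pfds[i].revents & POLLOUT))
				continue;

			w = write(pfds[i].fd, bufs[i] + sent[i], lens[i] - sent[i]);
			if (w < 0)
			{
				/* not ready after all; try again on the next poll() */
				if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
					continue;
				return -1;
			}

			sent[i] += (size_t) w;
			if (sent[i] == lens[i])
			{
				pfds[i].fd = -1; /* this node is done */
				pending--;
			}
		}
	}
	return 0;
}
```

The point of the design is that a slow data node only delays its own batch: the loop keeps feeding the other descriptors while the slow one is not writable.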

This gives a 2x-5x speedup on various COPY queries to distributed hypertables.

Part of #4285

akuzm · Jun 28 '22 14:06

Codecov Report

Merging #4476 (d85d6d3) into main (33e4e55) will decrease coverage by 0.10%. The diff coverage is 87.78%.

@@            Coverage Diff             @@
##             main    #4476      +/-   ##
==========================================
- Coverage   90.99%   90.88%   -0.11%     
==========================================
  Files         224      224              
  Lines       42586    42785     +199     
==========================================
+ Hits        38751    38887     +136     
- Misses       3835     3898      +63     
| Impacted Files | Coverage Δ |
|---|---|
| tsl/src/remote/connection.c | 88.29% <73.68%> (-0.21%) :arrow_down: |
| tsl/src/remote/dist_copy.c | 87.85% <88.53%> (-5.53%) :arrow_down: |
| src/planner/constify_now.c | 97.93% <90.90%> (-0.99%) :arrow_down: |
| src/guc.c | 100.00% <100.00%> (ø) |
| tsl/src/nodes/data_node_copy.c | 94.90% <100.00%> (ø) |
| src/bgw/scheduler.c | 85.71% <0.00%> (-2.92%) :arrow_down: |
| src/loader/bgw_message_queue.c | 85.52% <0.00%> (-2.64%) :arrow_down: |
| tsl/src/reorder.c | 85.37% <0.00%> (-0.27%) :arrow_down: |
| src/bgw/job.c | 93.57% <0.00%> (-0.20%) :arrow_down: |

... and 1 more

codecov[bot] · Jun 28 '22 15:06

> Looking through remote_copy, I see that we use a replication factor of 2 (good), but it might be a good idea to test a few other replication factors as well (in particular 1, though that might already be tested elsewhere), along with a few more corner cases, such as copying empty rows (to a table with no columns) and copying no rows at all, and possibly a few others.

I added some tests with different replication factors and numbers of rows to dist_copy_long. I'm not sure how to test it without columns: is it possible to create a hypertable without a time column?

akuzm · Sep 19 '22 16:09

Turns out the text COPY passthrough is just totally broken :weary:

https://github.com/timescale/timescaledb/issues/4761

akuzm · Sep 27 '22 18:09