citus
citus copied to clipboard
Refactor intermediate result (de-)serialize methods
This is a preparation step for adding alternative intermediate result formats. The idea is that all serialization/deserialization for intermediate results should go through the same file so we don't need to modify multiple places to add support for new result formats.
IntermediateResultEncoder is the main interface. To use it, first we create an encoder using IntermediateResultEncoderCreate()
, then for each row we call IntermediateResultEncoderReceive()
. Finally we call IntermediateResultEncoderDone()
. The Receive
and Done
calls may return a string info which needs to be flushed to the output medium (connection or file).
In this PR we don't refactor worker_partition_protocol.c and corresponding CopyTaskFilesFromDirectory()
since (1) refactoring them out turned out to be difficult (2) we are going to remove task-tracker in near future anyway.
An example of another result format added using this interface is https://github.com/citusdata/citus/pull/3480/.
Codecov Report
Merging #3490 into master will decrease coverage by
0.01%
. The diff coverage is95.00%
.
@@ Coverage Diff @@
## master #3490 +/- ##
==========================================
- Coverage 91.56% 91.54% -0.02%
==========================================
Files 179 180 +1
Lines 35452 35474 +22
==========================================
+ Hits 32460 32476 +16
- Misses 2992 2998 +6
I think an alternative approach could be to have a layering that is more like DestReceiver -> Encoder (binary, text, ..) -> Byte sink (e.g. file, broadcast, ...).
I think the current PR is an improvement regardless, but maybe something worth considering for the future.