cobrix
Add the ability to reassemble a multi-segment file
Background
Multisegment files often have this structure:
file header record
segment header record
main segment record
child1 segment record
child2 segment record
...
segment footer record
segment header record
main segment record
child1 segment record
child2 segment record
...
segment footer record
file footer record
Feature
Implement options that would allow reassembling multiple segments into records like this:
file header | segment header | main segment | child1 segment | child2 segment | segment footer | file footer
file header | segment header | main segment | child1 segment | child2 segment | segment footer | file footer
file header | segment header | main segment | child1 segment | child2 segment | segment footer | file footer
Not every type of segment has to be present. The only mandatory segment is the main segment of a record.
Proposed Solution
This kind of processing is possible using window functions, but it can be very slow.
Supporting such processing at the record reader level could significantly improve performance.
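To make the idea concrete, here is a minimal sketch in plain Python of reassembling segments in a single pass over the records. It is not Cobrix code and not a proposed API. The function name `reassemble`, the segment type tags, and the `(segment type, payload)` record shape are all hypothetical. It assumes at most one child of each type per main record, and that a segment header applies until the matching segment footer. Since the file footer is only known at the end of the file, this sketch collects rows in memory and back-fills the footer; a real reader would probably need a different strategy for that.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Record = Tuple[str, str]            # (segment type, payload), hypothetical shape
Row = Dict[str, Optional[str]]      # one reassembled output record


def reassemble(records: Iterable[Record]) -> List[Row]:
    """Reassemble a flat stream of segment records into one row per main segment."""
    rows: List[Row] = []
    file_header: Optional[str] = None
    file_footer: Optional[str] = None
    segment_header: Optional[str] = None
    current: Optional[Row] = None   # the row currently being built

    def flush() -> None:
        # Emit the row in progress, if there is one.
        nonlocal current
        if current is not None:
            rows.append(current)
            current = None

    for seg_type, payload in records:
        if seg_type == "file_header":
            file_header = payload
        elif seg_type == "file_footer":
            flush()
            file_footer = payload
        elif seg_type == "segment_header":
            flush()
            segment_header = payload
        elif seg_type == "main":
            # A main segment always starts a new row; it is the only mandatory part.
            flush()
            current = {
                "file_header": file_header,
                "segment_header": segment_header,
                "main": payload,
            }
        elif seg_type == "segment_footer":
            if current is not None:
                current["segment_footer"] = payload
            flush()
            segment_header = None
        elif current is not None:
            # Any other type is treated as a child of the current main segment.
            # Children that appear before any main segment are ignored.
            current[seg_type] = payload

    flush()
    # The file footer is only known at the end, so back-fill it into every row.
    for row in rows:
        row["file_footer"] = file_footer
    return rows


sample = [
    ("file_header", "FH"),
    ("segment_header", "SH1"), ("main", "M1"), ("child1", "C1"), ("segment_footer", "SF1"),
    ("segment_header", "SH2"), ("main", "M2"), ("segment_footer", "SF2"),
    ("file_footer", "FF"),
]
for row in reassemble(sample):
    print(row)
```

Segments that are absent for a given main record are simply missing from the row, so `row.get("child2")` returns `None` in that case. The state kept between records is small (current headers and the row in progress), which is why doing this inside the reader could avoid the shuffle and sorting cost of window functions.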