Add the ability to reassemble a multi-segment file

Open yruslan opened this issue 2 years ago • 0 comments

Background

Multisegment files often have this structure:

file header record
  segment header record
    main segment record
       child1 segment record
       child2 segment record
       ...
  segment footer record
  segment header record
    main segment record
       child1 segment record
       child2 segment record
       ...
  segment footer record
file footer record

Feature

Implement options that would allow reassembling multiple segments into records like this:

file header | segment header | main segment | child1 segment | child2 segment | segment footer | file footer
file header | segment header | main segment | child1 segment | child2 segment | segment footer | file footer
file header | segment header | main segment | child1 segment | child2 segment | segment footer | file footer

Not every type of segment has to be present. The only mandatory segment is the main segment of a record.

Proposed Solution

This kind of processing is possible using window functions, but that can be very slow.

Adding support for this kind of processing at the record reader level could significantly improve performance.
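
To make the idea concrete, here is a minimal single-pass sketch in plain Python. It is not Cobrix code and not a proposed API. The segment type tags, the `reassemble` function, and the `(segment_type, payload)` input shape are all hypothetical. It also assumes that every segment type other than the headers, footers, and main segment is a child, and that each main record has at most one child of each type.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical segment type tags (assumption). A real file would identify
# segments by a segment id field defined in the copybook.
FILE_HEADER, FILE_FOOTER = "file_header", "file_footer"
SEG_HEADER, SEG_FOOTER = "segment_header", "segment_footer"
MAIN = "main"

Row = Dict[str, Optional[str]]


def reassemble(records: Iterable[Tuple[str, str]]) -> List[Row]:
    """Turn a flat stream of (segment_type, payload) pairs into one row per main record."""
    rows: List[Row] = []
    file_header: Optional[str] = None
    seg_header: Optional[str] = None
    group_start = 0  # index in `rows` where the current segment group begins
    current: Optional[Row] = None  # row of the most recent main record

    for seg_type, payload in records:
        if seg_type == FILE_HEADER:
            file_header = payload
        elif seg_type == SEG_HEADER:
            seg_header = payload
            group_start = len(rows)
            current = None
        elif seg_type == MAIN:
            # Only the main segment is mandatory; everything else may stay None.
            current = {FILE_HEADER: file_header, SEG_HEADER: seg_header,
                       MAIN: payload, SEG_FOOTER: None, FILE_FOOTER: None}
            rows.append(current)
        elif seg_type == SEG_FOOTER:
            # The footer arrives after its rows, so back-fill the current group.
            for row in rows[group_start:]:
                row[SEG_FOOTER] = payload
            seg_header = None
            group_start = len(rows)
            current = None
        elif seg_type == FILE_FOOTER:
            # The file footer is only known at the end, so back-fill every row.
            for row in rows:
                row[FILE_FOOTER] = payload
        elif current is not None:
            # Any other type is treated as a child of the latest main record.
            # Children that appear before any main record are ignored.
            current[seg_type] = payload
    return rows


sample = [
    ("file_header", "FH"),
    ("segment_header", "SH1"),
    ("main", "M1"), ("child1", "C1a"), ("child2", "C2a"),
    ("main", "M2"),
    ("segment_footer", "SF1"),
    ("main", "M3"), ("child1", "C1c"),
    ("file_footer", "FF"),
]
rows = reassemble(sample)
```

Note that footers force some buffering: rows of a segment group cannot be emitted until the segment footer is seen, and the file footer is only known at the end of the file. A reader-level implementation would probably need either a bounded buffer per group plus a separate pass for the file footer, or an option to leave the file footer out.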

yruslan, Apr 25 '23 07:04