khiops
Improve concatenation with drivers
Description
In most parallel tasks, slaves produce files (called chunks) that must be concatenated to produce a final file. The concatenation process currently works as follows:
- Each chunk file is sent to the master via MPI
- The master writes/appends the received chunk to the result file
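For reference, the master-side append step described above can be sketched as follows. This is only an illustration in Python using local files: the MPI transfer is left out, and `append_chunks_on_master` and `demo_append` are hypothetical names, not khiops-core functions.

```python
import os
import tempfile
from typing import Iterable


def append_chunks_on_master(chunk_paths: Iterable[str], result_path: str,
                            buffer_size: int = 1 << 20) -> int:
    """Append each chunk, in order, to result_path; return the bytes written.

    Hypothetical helper: in the real process the chunk content arrives via MPI,
    here it is simply read from local files.
    """
    total = 0
    with open(result_path, "ab") as result:
        for chunk_path in chunk_paths:
            with open(chunk_path, "rb") as chunk:
                while True:
                    block = chunk.read(buffer_size)
                    if not block:
                        break
                    # Every byte of every chunk goes through the master.
                    result.write(block)
                    total += len(block)
    return total


def demo_append() -> bytes:
    """Create three small chunks, concatenate them, and return the result."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i, payload in enumerate([b"a,b\n", b"1,2\n", b"3,4\n"]):
            path = os.path.join(tmp, f"chunk_{i}.txt")
            with open(path, "wb") as f:
                f.write(payload)
            paths.append(path)
        result_path = os.path.join(tmp, "result.txt")
        append_chunks_on_master(paths, result_path)
        with open(result_path, "rb") as f:
            return f.read()
```

The point of the sketch is that all chunk data is read and rewritten by a single process, which is the cost this issue wants to reduce.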
Questions/Ideas
- Use drivers for concatenation:
  - Each slave writes its chunk directly to the cloud
  - We call a new driver method dedicated to concatenation. It should be more efficient than write/append, which is not very cloud-native.
- Another solution would be to write chunks directly to the cloud as part of a multipart file
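A rough sketch of what the driver-based idea could look like, to help the API discussion. Everything here is an assumption: `StorageDriver`, `LocalDriver`, `write`, `concat` and `demo_driver_concat` are hypothetical names, and the local implementation just copies bytes. A cloud driver would presumably map `concat` to a server-side operation (for example, block blobs on Azure or multipart uploads on S3) so that chunk data never goes through the master.

```python
import os
import shutil
import tempfile
from typing import Protocol, Sequence


class StorageDriver(Protocol):
    """Hypothetical driver interface, not the actual Khiops driver API."""

    def write(self, uri: str, data: bytes) -> None: ...

    def concat(self, source_uris: Sequence[str], target_uri: str) -> None: ...


class LocalDriver:
    """Local-filesystem stand-in used only to make the sketch runnable."""

    def write(self, uri: str, data: bytes) -> None:
        with open(uri, "wb") as f:
            f.write(data)

    def concat(self, source_uris: Sequence[str], target_uri: str) -> None:
        # A cloud driver would replace this copy loop with a server-side call.
        with open(target_uri, "wb") as target:
            for uri in source_uris:
                with open(uri, "rb") as source:
                    shutil.copyfileobj(source, target)


def demo_driver_concat(driver: StorageDriver) -> bytes:
    """Slaves write chunks through the driver; the master only sends names."""
    with tempfile.TemporaryDirectory() as tmp:
        chunk_uris = []
        for rank, payload in enumerate([b"a,b\n", b"1,2\n", b"3,4\n"]):
            uri = os.path.join(tmp, f"chunk_{rank}.txt")
            driver.write(uri, payload)  # done by each slave
            chunk_uris.append(uri)
        target_uri = os.path.join(tmp, "result.txt")
        driver.concat(chunk_uris, target_uri)  # done once by the master
        with open(target_uri, "rb") as f:
            return f.read()
```

With this shape, the master would only need the ordered list of chunk names, and drivers without a native concatenation could fall back to the current write/append behaviour.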
@bruno-at-orange to define the API; Tristan to implement it in the Azure driver and benchmark the impact; @bruno-at-orange to implement it in khiops-core.
See: #778