
Improve parallelism of closing files in FileIO

Open damccorm opened this issue 3 years ago • 2 comments

Currently, close happens in processElement, which is invoked per window. If many windows fire, throughput can be throttled because each close waits on IO in turn; the closes could instead run in parallel in finishBundle.
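To make the proposal concrete, here is a minimal sketch of the idea in plain Java. It is not Beam's FileIO code: `DeferredCloser`, `processWindow`, and the `finishBundle` method below are hypothetical stand-ins for a DoFn's per-window processing and bundle-finish hooks, and the executor is assumed to be supplied by the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

// Hypothetical stand-in for a DoFn-like writer; names are illustrative only.
class DeferredCloser {
  private final List<AutoCloseable> pending = new ArrayList<>();

  // Analogous to processElement: remember the writer instead of closing it inline.
  void processWindow(AutoCloseable writer) {
    pending.add(writer);
  }

  // Analogous to finishBundle: start every close on the executor, then wait for all.
  // Returns the number of writers that were closed.
  int finishBundle(ExecutorService executor) {
    List<CompletableFuture<Void>> futures = new ArrayList<>();
    for (AutoCloseable writer : pending) {
      futures.add(CompletableFuture.runAsync(() -> {
        try {
          writer.close();
        } catch (Exception e) {
          // Surface the failure when the bundle result is joined.
          throw new RuntimeException(e);
        }
      }, executor));
    }
    // Block until every close has completed (or one has failed).
    CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
    int closed = pending.size();
    pending.clear();
    return closed;
  }
}
```

The trade-off in this sketch is that every writer stays open until the bundle finishes, so whatever each open writer holds is retained for longer than with an inline close.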

Imported from Jira BEAM-12776. Original Jira may contain additional context. Reported by: scwhittle.

damccorm · Jun 04 '22 21:06

@scwhittle A fix in https://github.com/apache/beam/pull/15354 seems to be causing OOMs for certain customer workflows. The customer worked around the issue by patching the code to cap the number of parallel closes at 2.
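For illustration only, capping the number of concurrent closes could look roughly like the sketch below. This is not the customer's patch or the code in either PR: `BoundedCloser`, `closeAll`, and `maxParallelCloses` are hypothetical names, and a fixed-size thread pool is just one assumed way to enforce such a cap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper; names and structure are assumptions for illustration.
class BoundedCloser {
  // Closes all writers with at most maxParallelCloses closes running at once.
  // Returns the number of writers closed.
  static int closeAll(List<? extends AutoCloseable> writers, int maxParallelCloses) {
    // A fixed pool of N threads means at most N close() calls run concurrently.
    ExecutorService pool = Executors.newFixedThreadPool(maxParallelCloses);
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (AutoCloseable writer : writers) {
        // The lambda returns a value, so it is submitted as a Callable,
        // which is allowed to throw the checked exception from close().
        futures.add(pool.submit(() -> {
          writer.close();
          return null;
        }));
      }
      for (Future<?> future : futures) {
        try {
          future.get();
        } catch (InterruptedException | ExecutionException e) {
          throw new IllegalStateException("close failed", e);
        }
      }
      return futures.size();
    } finally {
      pool.shutdown();
    }
  }
}
```

Calling `BoundedCloser.closeAll(writers, 2)` would mirror the limit of 2 mentioned above. Note that this only bounds how many closes run at once; it does not by itself limit how many writers are held open while waiting.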

lukecwik · Aug 09 '22 16:08

https://github.com/apache/beam/pull/22645 may reduce the likelihood of these OOMs.

lukecwik · Aug 09 '22 16:08