Why doesn't Parquet currently support writing multiple row groups simultaneously?
Hi Parquet developers,
I have a question regarding the current implementation of Parquet. As far as I understand, Parquet does not support writing multiple row groups simultaneously. Could you please explain the reasoning behind this design choice?
Additionally, I am considering modifying Parquet to allow for multiple row groups to exist in memory and be flushed sequentially. From a high-level perspective, does this approach seem feasible? Are there any potential pitfalls or challenges I should be aware of?
Thank you for your time and assistance.
Best regards,
This would complicate the implementation and result in a large memory footprint. Does it make sense to use multiple file writers instead?
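
To make the memory concern concrete, here is a toy model in plain Python. It does not use any real Parquet API: `write_buffered`, `write_one_at_a_time`, and the byte payloads are hypothetical stand-ins. The only assumption it takes from the format is that row groups sit one after another in the file and that their byte offsets are recorded for the footer metadata.

```python
import io

def write_buffered(groups, sink):
    """Toy model: keep every row group in memory, then flush them in order."""
    buffers = [b"".join(chunks) for chunks in groups]  # all groups resident at once
    peak = sum(len(buf) for buf in buffers)            # bytes held before the first flush
    offsets = []
    for buf in buffers:
        offsets.append(sink.tell())  # start offset, as footer metadata would record it
        sink.write(buf)
    return offsets, peak

def write_one_at_a_time(groups, sink):
    """Toy model: build and flush a single row group before starting the next."""
    offsets = []
    peak = 0
    for chunks in groups:
        buf = b"".join(chunks)       # only this group is resident
        peak = max(peak, len(buf))
        offsets.append(sink.tell())
        sink.write(buf)
    return offsets, peak

# Three hypothetical row groups of 200, 150, and 100 bytes.
groups = [[b"a" * 100, b"b" * 100], [b"c" * 150], [b"d" * 50, b"e" * 50]]

buffered_sink = io.BytesIO()
sequential_sink = io.BytesIO()
buffered_offsets, buffered_peak = write_buffered(groups, buffered_sink)
sequential_offsets, sequential_peak = write_one_at_a_time(groups, sequential_sink)

print("buffered peak bytes:", buffered_peak)
print("sequential peak bytes:", sequential_peak)
```

Both functions produce the same bytes and the same offsets. The difference is how much data is held in memory before anything reaches the file: the buffered variant holds the sum of all open row groups, while the sequential variant holds only the largest single one. With separate file writers, each file keeps the simple sequential layout, and parallelism comes from writing several files at once.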