incubator-graphar
[Feat][C++] Add a `WriterOption` to let users configure writer options such as compression
Is your feature request related to a problem? Please describe.
Currently the GraphAr C++ library supports writing chunks in different file formats (CSV, Parquet and ORC) using Arrow's built-in file-format support. Arrow provides writer options for each file format to configure settings such as the compression type, but GraphAr only uses the default options when writing:
CSV: https://github.com/alibaba/GraphAr/blob/ad30121070c9dc115ac916ef620de29e2097af77/src/filesystem.cc#L205-L210
Parquet: https://github.com/alibaba/GraphAr/blob/ad30121070c9dc115ac916ef620de29e2097af77/src/filesystem.cc#L216-L220
ORC: https://github.com/alibaba/GraphAr/blob/ad30121070c9dc115ac916ef620de29e2097af77/src/filesystem.cc#L224-L225
Consider adding a GraphAr WriterOption to allow users to configure the writer options.
Describe the solution you'd like
Implement a WriterOption like:
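class WriterOption {
 public:
  class builder {
   public:
    // Sets the compression type and returns this builder to allow chaining.
    inline builder* compression(CompressionType);
    // Creates the WriterOption from the configured settings.
    inline std::shared_ptr<WriterOption> build();
  };
};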
and when writing chunks with GraphAr, use:
WriterOption::builder builder;
builder.compression(CompressionType::ZSTD);
auto writer_option = builder.build();
auto writer = VertexChunkWriter(vertex_info, prefix, writer_option);
As a first issue, we only need to consider supporting the compression setting.
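To make the proposal concrete, here is a minimal, self-contained sketch of how a compression-only `WriterOption` builder might look. It is an illustration, not GraphAr's implementation: the `CompressionType` enum values, the `UNCOMPRESSED` default, the private constructor, and the `compression()` getter are all assumptions that the proposal above does not specify.

```cpp
#include <memory>

// Hypothetical enum: the proposal does not define CompressionType,
// so these values are assumed for illustration only.
enum class CompressionType { UNCOMPRESSED, SNAPPY, GZIP, ZSTD };

class WriterOption {
 public:
  class builder {
   public:
    // Records the compression type and returns this builder for chaining.
    builder* compression(CompressionType type) {
      compression_ = type;
      return this;
    }

    // Creates a WriterOption from the settings collected so far.
    std::shared_ptr<WriterOption> build() const {
      return std::shared_ptr<WriterOption>(new WriterOption(compression_));
    }

   private:
    // Assumed default: no compression unless the user asks for it.
    CompressionType compression_ = CompressionType::UNCOMPRESSED;
  };

  // Assumed accessor that a chunk writer could use to read the setting.
  CompressionType compression() const { return compression_; }

 private:
  // Only the builder constructs options, so the result stays read-only.
  explicit WriterOption(CompressionType c) : compression_(c) {}

  CompressionType compression_;
};
```

With this sketch, `builder.compression(CompressionType::ZSTD)->build()` yields an option whose `compression()` returns `CompressionType::ZSTD`. A writer would then map that value onto the corresponding Arrow writer setting for each file format; that mapping is left out here.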
Additional context #75
cc/ @lixueclaire