parquet4s

partitionBy: option to not include the partition key in the filesystem path

Open s1rc0 opened this issue 3 years ago • 1 comment

Hi,

I use parquet4s with Akka Streams. The library lets you partition data via the partitionBy() method. Is there any way to leave the partition key out of the path name?

For example, if I have .partitionBy(Col("name")), parquet4s will create a folder named name=nameFromMyData. Is it possible to have a folder named just nameFromMyData?

Thanks in advance

s1rc0, Dec 13 '21 18:12

Hi Sergey, No, it is not possible at the moment to have non-standard partition names. Parquet4s follows the most common standard, which allows both the column name and the value to be restored during reading. However, if you wish to contribute, I think it should be quite easy to add an optional parameter to the builder, e.g. partitionFormat, where you could use a string format to define a custom partition format.

mjakubowski84, Dec 14 '21 08:12
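
To make the trade-off in the reply concrete, here is a minimal sketch of how a string format could drive the partition folder name. It is only an illustration: `PartitionFormatSketch`, `HiveStyle`, `ValueOnly`, `partitionSegment` and `parseHiveSegment` are hypothetical names, not part of the parquet4s API, and the `partitionFormat` parameter mentioned above is a suggestion that may not exist in any release.

```scala
// Hypothetical sketch: none of these names belong to the parquet4s API.
object PartitionFormatSketch {

  // Standard layout: the folder name carries both column name and value.
  val HiveStyle: String = "%1$s=%2$s"

  // Value-only layout, as asked for in the question.
  val ValueOnly: String = "%2$s"

  /** Renders one directory segment from a column name and its value. */
  def partitionSegment(format: String, column: String, value: String): String =
    format.format(column, value)

  /** Recovers (column, value) from a "column=value" segment, if it has that shape. */
  def parseHiveSegment(segment: String): Option[(String, String)] =
    segment.split("=", 2) match {
      case Array(column, value) if column.nonEmpty => Some((column, value))
      case _                                       => None
    }
}
```

With `HiveStyle`, the segment for column `name` and value `nameFromMyData` is `name=nameFromMyData`, and `parseHiveSegment` can split it back into both parts. With `ValueOnly`, the segment is just `nameFromMyData`, so a reader can no longer recover the column name from the path alone and would need it supplied some other way.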