parquet4s
Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.
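A minimal sketch of that pitch, based on the library's documented case-class API (the `User` schema and file location are illustrative):

```scala
import com.github.mjakubowski84.parquet4s.{ParquetReader, ParquetWriter, Path}

// The case class doubles as the Parquet schema — no cluster required.
case class User(id: Long, name: String)

object QuickStart extends App {
  val file = Path("data/users.parquet") // hypothetical location

  ParquetWriter.of[User].writeAndClose(file, Seq(User(1, "Ada"), User(2, "Alan")))

  val users = ParquetReader.as[User].read(file)
  try users.foreach(println)
  finally users.close()
}
```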
POC for #333.
This change adds a `pathFilter` option to the `ParquetReader` builder interface, because there are situations where users need to configure path filter predicates (e.g. they use a `_` prefix for partition columns)....
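As a sketch of how the proposed option might be used — the `pathFilter` builder method is what this PR introduces, so its exact name and signature are assumptions here:

```scala
import com.github.mjakubowski84.parquet4s.{ParquetReader, Path}
import org.apache.hadoop.fs.PathFilter

case class Data(id: Long, value: String) // hypothetical schema

object PathFilterUsage extends App {
  // Hadoop's default filter hides '_'-prefixed paths; accept everything so that
  // '_'-prefixed partition directories (e.g. '_date=2024-01-01') are still read.
  val acceptAll: PathFilter = _ => true

  val rows = ParquetReader
    .as[Data]
    .pathFilter(acceptAll) // hypothetical method added by this PR
    .read(Path("data/input"))
  try rows.foreach(println)
  finally rows.close()
}
```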
I am not sure how much effort it would take; I am just asking if you would be willing and available to look over the PR. It's currently stuck at 1.2.1 and it's just too far...
First, I want to thank you for this great library! I need to merge hundreds of small Parquet files into bigger ones. Sadly, they do not all have the same schema...
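A minimal sketch of the easy single-schema case, assuming all inputs fit one case class; unifying genuinely divergent schemas is the open part of the question (paths and the `Record` schema are illustrative):

```scala
import com.github.mjakubowski84.parquet4s.{ParquetReader, ParquetWriter, Path}

case class Record(id: Long, payload: String) // hypothetical shared schema

object MergeSmallFiles extends App {
  val inputDir   = Path("data/small-files")    // hypothetical location
  val outputFile = Path("data/merged.parquet")

  // Reading a directory streams the rows of every file under it.
  val rows = ParquetReader.as[Record].read(inputDir)
  try ParquetWriter.of[Record].writeAndClose(outputFile, rows)
  finally rows.close()
}
```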
Hi, in the akkaPekko module I noticed that `ParquetPartitioningFlow` has no handling mechanism for the `IOException` that the Hadoop `ParquetWriter` may throw when `write` is called. I thought it...
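A sketch of the call-site workaround available today, assuming the `IOException` surfaces as a stream failure: react to it via the materialized `Future` rather than inside the flow (the `Event` payload and paths are illustrative):

```scala
import com.github.mjakubowski84.parquet4s.{ParquetStreams, Path}
import org.apache.pekko.Done
import org.apache.pekko.actor.ActorSystem
import org.apache.pekko.stream.scaladsl.{Sink, Source}

import java.io.IOException
import scala.concurrent.Future
import scala.util.{Failure, Success}

case class Event(id: Int) // hypothetical payload

object HandleWriteFailure extends App {
  implicit val system: ActorSystem = ActorSystem()
  import system.dispatcher

  val done: Future[Done] =
    Source(1 to 100)
      .map(Event(_))
      .via(ParquetStreams.viaParquet.of[Event].maxCount(50L).write(Path("data/out")))
      .runWith(Sink.ignore)

  // An IOException thrown inside the flow fails the whole stream, so the only
  // place to react today is the materialized Future.
  done.onComplete {
    case Failure(e: IOException) => system.log.error(e, "Parquet write failed"); system.terminate()
    case Failure(e)              => system.log.error(e, "Stream failed"); system.terminate()
    case Success(_)              => system.terminate()
  }
}
```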
```scala
import com.github.mjakubowski84.parquet4s.ScalaCompat.stream.scaladsl.Sink
import com.github.mjakubowski84.parquet4s._
import org.apache.avro.generic.IndexedRecord
import org.apache.parquet.hadoop.ParquetFileWriter.Mode
import org.apache.parquet.hadoop.metadata.CompressionCodecName
import org.apache.pekko.Done
import org.apache.pekko.stream.scaladsl._

import scala.concurrent.Future
import scala.concurrent.duration.DurationInt

object AvroToParquetSink {
  private val writeOptions = ParquetWriter.Options(writeMode = Mode.OVERWRITE, compressionCodecName...
```
Implementation for issue #351. I had to add the `ClassTag` machinery because sbt was complaining about type erasure on the match [here](https://github.com/mjakubowski84/parquet4s/pull/350/files#diff-1a01f20cd6d3cace915f9ae2b24ea5387cc68a30723f7a449bb2618dd0b0b703R393).
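For context, a generic-Scala sketch of the pattern being referred to: matching on a bare type parameter is unchecked due to erasure, so an implicit `ClassTag` is needed to make the match checked at runtime (names here are illustrative, not from the PR):

```scala
import scala.reflect.ClassTag

// The ClassTag context bound carries the runtime class of T, so the
// `case t: T` pattern is actually checked instead of erased.
def firstOfType[T: ClassTag](xs: Seq[Any]): Option[T] =
  xs.collectFirst { case t: T => t }

// firstOfType[String](Seq(1, "a", 2.0)) == Some("a")
```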
Hi, I have encountered an inconsistency in the handling of projection schemas. When reading a partitioned Parquet dataset with `ParquetReader.projectedGeneric(expectedSchema).options(...).read`, the output rows contain the partitioning columns even if `expectedSchema` does not include them. When reading non-partitioned Parquet...
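A sketch of the reported behaviour using the column-based form of `projectedGeneric`, assuming a hypothetical dataset partitioned by `date` (e.g. `data/events/date=2024-01-01/part-0.parquet`):

```scala
import com.github.mjakubowski84.parquet4s.{Col, ParquetReader, Path}

object ProjectionRepro extends App {
  val rows = ParquetReader
    .projectedGeneric(Col("id").as[Long], Col("name").as[String]) // projection omits `date`
    .read(Path("data/events"))
  try
    // Reported inconsistency: for a partitioned dataset the records still carry
    // the `date` partition column; for a non-partitioned dataset they would not.
    rows.foreach(println)
  finally rows.close()
}
```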