Writing dates as parquet datetime types

Open ptyrlik1 opened this issue 2 years ago • 5 comments

I have a program that writes a list of objects of a specific type to Parquet. The issue is that date properties are saved in the Parquet file as strings rather than as a datetime type.

This is how I have my parser configured:

using (var parser = new ChoParquetWriter(outSteam)
    .Configure(c => c.Culture = CultureInfo.InvariantCulture)
    .Configure(c => c.TypeConverterFormatSpec = new ChoTypeConverterFormatSpec { DateTimeFormat = "o" })

This is the definition of the property in the class:

public DateTime? date_reported { get; set; }

And this is an example of what the date looks like in the database I am reading from:

2023-01-09 00:00:00.000

And this is how it is stored in the object: [image]

ptyrlik1 avatar May 09 '23 17:05 ptyrlik1

Well, the underlying Parquet driver doesn't support the datetime type, so the value is stored as text.

Cinchoo avatar May 10 '23 12:05 Cinchoo
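
For reference, here is a minimal sketch of what the `"o"` (round-trip) format specifier from the configuration above yields in plain .NET, independent of ChoETL. The class and method names are hypothetical, and the sample value is the date quoted from the database; the point is only that the result is a string, not a typed timestamp.

```csharp
using System;
using System.Globalization;

public static class DateDemo
{
    // Formats a DateTime with the "o" (round-trip) specifier under the invariant culture.
    public static string ToRoundTrip(DateTime value) =>
        value.ToString("o", CultureInfo.InvariantCulture);

    public static void Main()
    {
        // Same value as the database example: 2023-01-09 00:00:00.000, no time zone attached.
        var reported = new DateTime(2023, 1, 9, 0, 0, 0, DateTimeKind.Unspecified);

        // Prints: 2023-01-09T00:00:00.0000000
        Console.WriteLine(ToRoundTrip(reported));
    }
}
```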

Is there a datetime-like type that it does support, such as DateTimeOffset?

ptyrlik1 avatar May 11 '23 21:05 ptyrlik1

Yes, there is a way to use DateTimeOffset. Let me add it. Will update.

Cinchoo avatar May 12 '23 23:05 Cinchoo

Did you push this update, and if so, how is it used?

ptyrlik1 avatar Sep 10 '23 23:09 ptyrlik1

Yes, here is how you can control the output:

using (var w = new ChoParquetWriter(filePath)
    .Configure(c => c.TreatDateTimeAsDateTimeOffset = true))
{
    w.Write(recs);
}

Cinchoo avatar Sep 13 '23 18:09 Cinchoo
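
A fuller sketch of the snippet above may help show where the setting fits in a complete program. Everything here other than `TreatDateTimeAsDateTimeOffset`, `Configure`, `Write`, and the `date_reported` property is an assumption for illustration: the `Report` class, its `id` property, the output path, and the use of the generic `ChoParquetWriter<T>` form rather than the non-generic writer shown in the thread. It also assumes the ChoETL.Parquet NuGet package is installed and recent enough to include the option.

```csharp
using System;
using System.Collections.Generic;
using ChoETL; // assumed: provided by the ChoETL.Parquet NuGet package

// Hypothetical record type; only date_reported comes from the thread.
public class Report
{
    public int id { get; set; }
    public DateTime? date_reported { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Sample records standing in for rows read from the database.
        var recs = new List<Report>
        {
            new Report { id = 1, date_reported = new DateTime(2023, 1, 9) },
            new Report { id = 2, date_reported = new DateTime(2023, 1, 10) },
        };

        // "reports.parquet" is a placeholder output path.
        using (var w = new ChoParquetWriter<Report>("reports.parquet")
            .Configure(c => c.TreatDateTimeAsDateTimeOffset = true))
        {
            // With the option enabled, date properties should be written as a
            // DateTimeOffset-backed column instead of text.
            w.Write(recs);
        }
    }
}
```

Reading the file back with a Parquet viewer is a quick way to confirm the column type actually changed.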