spark-bigquery-connector
Reading a DATETIME column into a DataFrame drops trailing zeroes in the seconds part
Data in BigQuery:

id  c_ts
2   2022-03-14T06:05:07
3   2022-03-14T00:00:00
1   2022-03-14T06:05:00
DDL:
CREATE TABLE `pypl-edw.pp_scratch.datetime_test`
(
id STRING,
c_ts DATETIME
)
Code:

val tableDataFrame = sparkSession.read.format(Constants.BIGQUERY).load("pypl-edw.pp_scratch.datetime_test")
tableDataFrame.show()
Dataframe:
+---+-------------------+
| id|               c_ts|
+---+-------------------+
|  3|   2022-03-14T00:00|
|  1|   2022-03-14T06:05|
|  2|2022-03-14T06:05:07|
+---+-------------------+
I know the column type in Spark is StringType, but as you can see, the seconds part is dropped when the value is read from BigQuery into the DataFrame if the seconds are 00.
2022-03-14T00:00:00 -> 2022-03-14T00:00
2022-03-14T06:05:00 -> 2022-03-14T06:05
This is not expected, and it causes failures when I apply any time function to the column or insert it into a sink table.
Spark version: 2.3 and spark-bigquery-with-dependencies_2.11:0.22.0
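For what it's worth, the truncation pattern looks consistent with `java.time.LocalDateTime.toString()`, which renders ISO-8601 and omits the seconds field when it is zero. If the connector (or anything in its conversion path) stringifies DATETIME values this way, `:00` seconds would disappear exactly as shown above. A minimal plain-Java illustration of that behavior, with no Spark or BigQuery involved (so this is only a plausible explanation, not a confirmed root cause):

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class DatetimeTruncation {
    public static void main(String[] args) {
        LocalDateTime dt = LocalDateTime.parse("2022-03-14T00:00:00");

        // toString() uses ISO_LOCAL_DATE_TIME, which drops zero seconds.
        System.out.println(dt);  // prints 2022-03-14T00:00

        // An explicit pattern keeps the seconds field.
        DateTimeFormatter full = DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss");
        System.out.println(dt.format(full));  // prints 2022-03-14T00:00:00
    }
}
```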
Hi,
Any update on this? We are facing the same issue with spark 3.2 and spark-bigquery-with-dependencies_2.12-0.24.2
Hi @davidrabinowitz, wondering if there's a work-around for this. We're seeing this issue as well with Spark 2.4 and spark-bigquery-with-dependencies_2.12-0.23.2.jar. Is this isolated to the case where the datetime field ends in :00 (i.e. the seconds are zero)? Thanks!
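Until the connector itself is fixed, one string-level work-around is to pad the truncated values back to full `HH:mm:ss` form before applying time functions or writing to a sink. The `normalize` helper below is a hypothetical sketch (not part of the connector's API); in Spark it could be applied as a UDF, or the equivalent done with `regexp_replace` on the string column:

```java
public class DatetimeNormalize {
    // Hypothetical helper: re-append the ":00" components that a
    // truncated ISO-8601 datetime string is missing.
    static String normalize(String s) {
        if (s.matches("\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}")) {
            return s + ":00";      // seconds were dropped
        }
        if (s.matches("\\d{4}-\\d{2}-\\d{2}T\\d{2}")) {
            return s + ":00:00";   // minutes and seconds were dropped
        }
        return s;                  // already a complete datetime string
    }

    public static void main(String[] args) {
        System.out.println(normalize("2022-03-14T00:00"));     // padded to 2022-03-14T00:00:00
        System.out.println(normalize("2022-03-14T06:05:07"));  // left unchanged
    }
}
```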