spark-bigquery-connector

Reading a DATETIME column into a DataFrame drops the seconds when they are 00

Open bigcaiyujie opened this issue 2 years ago • 2 comments

Data in BigQuery:

    id  c_ts
    2   2022-03-14T06:05:07
    3   2022-03-14T00:00:00
    1   2022-03-14T06:05:00

DDL:

    CREATE TABLE `pypl-edw.pp_scratch.datetime_test`
    (
      id STRING,
      c_ts DATETIME
    )

Code:

    val tableDataFrame = sparkSession.read.format(Constants.BIGQUERY).load("pypl-edw.pp_scratch.datetime_test")
    tableDataFrame.show()

Dataframe:

      +---+-------------------+
      | id|               c_ts|
      +---+-------------------+
      |  3|   2022-03-14T00:00|
      |  1|   2022-03-14T06:05|
      |  2|2022-03-14T06:05:07|
      +---+-------------------+

I know the column type in Spark is StringType, but as you can see, the seconds component is dropped when the value is read from BigQuery into the DataFrame and the seconds are 00.

    2022-03-14T00:00:00 -> 2022-03-14T00:00
    2022-03-14T06:05:00 -> 2022-03-14T06:05

This is not expected, and it fails if I apply any time function to the column or insert it into a sink table.
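The shortened values look like what `java.time.LocalDateTime.toString` produces, since that method omits the seconds when they are zero. Whether the connector actually goes through that method is an assumption here and is not confirmed anywhere in this thread. As a possible stopgap, the strings could be re-padded after reading. The following is only a sketch: `DatetimeNormalizer` and `normalize` are hypothetical names and are not part of the connector.

```scala
import java.time.LocalDateTime
import java.time.format.{DateTimeFormatter, DateTimeFormatterBuilder}
import java.time.temporal.ChronoField

// Hypothetical helper, not part of spark-bigquery-connector.
object DatetimeNormalizer {
  // Always writes the seconds; writes a fraction (up to microseconds)
  // only when the fraction is non-zero.
  private val withSeconds: DateTimeFormatter =
    new DateTimeFormatterBuilder()
      .appendPattern("uuuu-MM-dd'T'HH:mm:ss")
      .appendFraction(ChronoField.NANO_OF_SECOND, 0, 6, true)
      .toFormatter()

  // LocalDateTime.parse accepts ISO-8601 strings with or without seconds,
  // so both "2022-03-14T00:00" and "2022-03-14T06:05:07" are valid input.
  // Nulls are passed through unchanged.
  def normalize(value: String): String =
    Option(value)
      .map(v => LocalDateTime.parse(v).format(withSeconds))
      .orNull
}
```

In a Spark job this could presumably be wrapped with `org.apache.spark.sql.functions.udf` and applied to `c_ts` before any time function or sink write. That step is untested against the connector versions mentioned in this issue.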

Spark version: 2.3 and spark-bigquery-with-dependencies_2.11:0.22.0

bigcaiyujie Mar 17 '22 02:03

Hi,

Any update on this? We are facing the same issue with Spark 3.2 and spark-bigquery-with-dependencies_2.12-0.24.2.

jzrebiec Jul 18 '22 09:07

Hi @davidrabinowitz, wondering if there's a workaround for this. We are also seeing this issue with Spark 2.4 and spark-bigquery-with-dependencies_2.12-0.23.2.jar. Is it isolated to the case where the datetime field ends in :00 (i.e. is at second 0 of the minute)? Thanks!

vicuna96 Oct 09 '22 23:10