bigquery-emulator icon indicating copy to clipboard operation
bigquery-emulator copied to clipboard

Query response cannot be parsed when the result contains a nested timestamp

Open Kamulau opened this issue 2 years ago • 7 comments

I found the bug while writing an integration test using python and test-containers, and have also confirmed behavior via CLI.

Steps to reproduce via Python (used Python 3.8.16)

from google.cloud.bigquery import Table, SchemaField, Client

# Instantiate client = Client(), using emulator

test_project = "test-project"
test_dataset = "test_dataset"
test_schema = [
    SchemaField("test_timestamp_field", "TIMESTAMP", mode="REQUIRED")
]
test_table = f"{test_project}.{test_dataset}.test_table"
table = Table(test_table, schema=test_schema)
client.create_table(table)

from datetime import datetime
dtf = "%Y-%m-%d %H:%M:%S UTC"
dt1 = datetime.strptime("2021-11-29 22:00:00 UTC", dtf)

test_data = [
    {
        "test_timestamp_field": dt1.strftime(dtf)
    } 
]
client.insert_rows_json(test_table, test_data)
query1 = f"SELECT * from {test_table}"
query2 = f"SELECT STRUCT(test_timestamp_field) as nested_timestamp_field from {test_table}"

res1 = client.query(query1).result()
for r in res1:
    print(r)

res2 = client.query(query2).result()
for r in res2: # An error is thrown here
    print(r)

Python Error

Here is the error message when running the Python code above ValueError: invalid literal for int() with base 10: '2021-11-29T22:00:00Z'

Steps to reproduce using CLI

docker run -p 55687:9050 -it ghcr.io/goccy/bigquery-emulator:latest --project=test-project --dataset=test_dataset

bq --api http://localhost:55687 mk --project_id test-project --schema test_timestamp_field:timestamp -t test_dataset.test_table

bq --api http://localhost:55687 query --project_id test-project --nouse_legacy_sql \
"
INSERT test_dataset.test_table(test_timestamp_field) values ('2021-11-29 22:00:00 UTC')
"

bq --api http://localhost:55687 query --project_id test-project --nouse_legacy_sql \
'SELECT
   test_timestamp_field
 FROM
   `test-project`.test_dataset.test_table'


bq --api http://localhost:55687 query --project_id test-project \
'SELECT
   STRUCT(test_timestamp_field) as nested_timestamp_field
 FROM
   `test-project`.test_dataset.test_table' 

CLI Actual Response

Here is the response of the second query via CLI, with the nested timestamp field:

+------------------------------------------------------------+
|                   nested_timestamp_field                   |
+------------------------------------------------------------+
| {"test_timestamp_field":"<date out of range for display>"} |
+------------------------------------------------------------+

CLI Expected Response

Here is the expected response (tested w/ BQ):

+----------------------------------------+
|         nested_timestamp_field         |
+----------------------------------------+
| {"reserve_time":"2021-11-29 22:00:00"} |
+----------------------------------------+

An aside

I have also found that you can wrap the struct with TO_JSON_STRING, and it mostly works for CLI (although the timestamp is formatted for JSON), ~~but it does not~~ it appears to work with the Python client. Note that the timestamp lacks double quotes (which is what BQ returns in this case):

bq --api http://localhost:55687 query --project_id test-project \
'SELECT
   TO_JSON_STRING(STRUCT(test_timestamp_field)) as nested_timestamp_field
 FROM
   `test-project`.test_dataset.test_table' 
+-----------------------------------------------+
|            nested_timestamp_field             |
+-----------------------------------------------+
| {"test_timestamp_field":2021-11-29T22:00:00Z} |
+-----------------------------------------------+

I have also tested the above with BigQuery, and there were no errors and CLI returned responses, as expected.

Kamulau avatar Jul 31 '23 06:07 Kamulau

Also, perhaps this is related to https://github.com/goccy/bigquery-emulator/issues/32

Kamulau avatar Jul 31 '23 06:07 Kamulau

This is related to #265 I've fixed the TO_JSON() for the TimestampValue in this PR https://github.com/goccy/go-zetasqlite/pull/111/files#diff-a710e43b641c24f0f092d550d22fd984a9f09c62b7b4b8e41db3144c0217f592R2105-R2108

ohaibbq avatar Feb 02 '24 18:02 ohaibbq