bigquery-emulator icon indicating copy to clipboard operation
bigquery-emulator copied to clipboard

Error processing job with UNNEST and query parameters

Open Lockeid opened this issue 1 year ago • 2 comments

Steps to reproduce:

> docker run -p 9060:9050 -it ghcr.io/goccy/bigquery-emulator:latest --project=test --log-level=debug

# Separate window
> bq --api http://0.0.0.0:9060 query --project_id=test --parameter='states:ARRAY<STRING>:["WA", "WI", "WV", "WY"]' "SELECT * FROM UNNEST(@states)"
BigQuery error in query operation: Error processing job 'test:bqjob_r15b2cf4b587ab4e9_0000018b19208f63_1': failed to analyze: INVALID_ARGUMENT: Values referenced in UNNEST must be arrays. UNNEST contains expression of type
INT64 [at 1:22]

I don't know if the issue comes from here or from zetasql

Lockeid avatar Oct 10 '23 10:10 Lockeid

This issue appears difficult to fix given the current design, so it might be better to temporarily use an alternative method or a workaround.

As a workaround, you can use CAST. Although it becomes redundant when running on the actual BigQuery, casting to the appropriate type can avoid the problem in many cases. (not all, though)

❯ bq --api http://0.0.0.0:9050 query --project_id=test --parameter='states:ARRAY<STRING>:["WA", "WI", "WV", "WY"]' "SELECT * FROM UNNEST(@states)"
BigQuery error in query operation: Error processing job 'local_project:bqjob_r686699b4eec77854_0000018d7dd573cf_1': failed to
analyze: INVALID_ARGUMENT: Values referenced in UNNEST must be arrays. UNNEST contains expression of type INT64 [at 1:22]

❯ bq --api http://0.0.0.0:9050 query --project_id=test --parameter='states:ARRAY<STRING>:["WA", "WI", "WV", "WY"]' "SELECT * FROM UNNEST(CAST(@states AS ARRAY<STRING>))"
+----------+
| $unnest1 |
+----------+
| WA       |
| WI       |
| WV       |
| WY       |
+----------+

The cause of this issue can be summarized as follows: When executing a Parameterized Query from a BigQuery command or BigQuery client, you can specify the type of the parameters. However, go-zetasqlite, which is used internally by the bigquery-emulator, does not utilize this parameter type. As a result, some functions and contexts that strictly check types in zetsql (for example, UNNEST used as an argument in the FROM clause) may cause errors because go-zetasqlite does not pass types internally when analyzing SQL. I am trying to resolve the issue, but it seems that it is not easy to fix this issue because the current interface does not allow passing types to go-zetasqlite. (and it is an interface of the database/sql package in Go...)

totem3 avatar Feb 06 '24 09:02 totem3

It seems that ZetaSQL allows us to specify query parameter types in the AnalyzerOptions, so the crux here does seem to be that we don't have an interface to pass them via the database/sql package.

In goccy/go-zetasqlite#207 I used a context key to pass along an option to the driver. It seems that is an antipattern in Golang, but I wonder if we could do something similar.

Another option would be to expose a method on the zetasqlite connection interface for specifying these values, like we do with NamePath configuration: https://github.com/goccy/bigquery-emulator/blob/main/internal/contentdata/repository.go#L38-L53

ohaibbq avatar Apr 14 '24 06:04 ohaibbq