list cant be passed as the parameter
Versions: Python: 3.12 Databricks SQL Connector: 4.0.0 polars: 1.16.0
I am trying to pass list parametes as it was mentined in docs: https://github.com/databricks/databricks-sql-python/blob/main/docs/parameters.md
Steps to reproduce:
- Create and populate table: ` CREATE TABLE epam.base.tmp_params_table ( product_code VARCHAR(15), qty int, BU VARCHAR(15) );
INSERT INTO epam.base.tmp_params_table (product_code, qty, BU) VALUES ('prod_cd_1', 5, 'BU_1'), ('prod_cd_2', 4, 'BU_1'), ('prod_cd_3', 3, 'BU_1'), ('prod_cd_4', 3, 'BU_2'), ('prod_cd_5', 1, 'BU_2');`
- Run queries using connector: `import polars as pl from configs.databricks_config import databricks_settings from databricks import sql from databricks.sql.parameters import StringParameter, IntegerParameter
query_params_list = [ [ "SELECT product_code, qty, BU from epam.base.tmp_params_table where product_code in (:product_code) and BU= :BU", {"product_code": ["prod_cd_1", "prod_cd_2", "prod_cd_4"], "BU": ["BU_1"]}, ], [ "SELECT product_code, qty, BU from epam.base.tmp_params_table where product_code in (:product_code) and BU= :BU", [ StringParameter(name="product_code", value="prod_cd_1, prod_cd_2, prod_cd_4"), StringParameter(name="BU", value="BU_1"), ], ], [ "SELECT product_code, qty, BU from epam.base.tmp_params_table where qty in %(:qty)s", {"qty": [1, 2, 3]}, ], ["SELECT product_code, qty, BU from :tab_name", {"tab_name": "epam.base.tmp_params_table"}], ]
with sql.connect( server_hostname=databricks_settings.host, http_path=databricks_settings.http_path, access_token=databricks_settings.api_token, ) as connection: with connection.cursor() as cursor: for num, record in enumerate(query_params_list): query, params = record try: cursor.execute(query, params) table = cursor.fetchall_arrow() res_df = pl.from_arrow(table) print(num, res_df.shape, params) except Exception as error: print(num, error, params)`
and I see this error:
databricks.sql.exc.NotSupportedError: Could not infer parameter type from value: ['prod_cd_1', 'prod_cd_2', 'prod_cd_4'] - <class 'list'> Please specify the type explicitly.
Currently, all parameterised queries run in NATIVE mode by default. And we do not support list parameters eg: ['prod_cd_1', 'prod_cd_2', 'prod_cd_4'] in NATIVE mode as of today.
For unblocking your use-case, you can use the INLINE mode for now :
import logging
import polars as pl
from databricks import sql
query_params_list = [
[
"SELECT product_code, qty, BU from epam.base.tmp_params_table where product_code in (%(product_code)s) and BU= %(BU)s",
{"product_code": "'prod_cd_1', 'prod_cd_2', 'prod_cd_4'", "BU": "'BU_1'"},
],
[
"SELECT product_code, qty, BU from epam.base.tmp_params_table where product_code in (%(product_code)s) and BU= %(BU)s",
{"product_code": "'prod_cd_1', 'prod_cd_2', 'prod_cd_4'", "BU": "'BU_1'"},
],
[
"SELECT product_code, qty, BU from epam.base.tmp_params_table where qty in (%(qty)s)",
{"qty": "1, 2, 3"},
],
["SELECT product_code, qty, BU from epam.base.tmp_params_table", {}], # table name cannot be a parameter because of security :(
]
logging.getLogger("databricks.sql").setLevel(logging.DEBUG)
logging.basicConfig(level=logging.DEBUG)
with sql.connect(
server_hostname=databricks_settings.host,
http_path=databricks_settings.http_path,
access_token=databricks_settings.api_token,
use_inline_params=True, ##Add this for inline parameterised query execution
) as connection:
with connection.cursor() as cursor:
for num, record in enumerate(query_params_list):
query, params = record
cursor.execute(query, params)
table = cursor.fetchall_arrow()
res_df = pl.from_arrow(table)
print(num, res_df.shape, params)
Thanks for raising the Issue @arseniy-rylov. We will evaluate a long term fix for this. Let me know if the workaround above works for you.
Since this issue is requesting support for native sequence parameters, is it basically a duplicate of #290?