analytics-componentized-patterns
analytics-componentized-patterns copied to clipboard
"SyntaxError: --params is not a correctly formatted JSON string or a JSON serializable dictionary" When running LTV lookalike notebook
Hi all,
I'm running the LTV lookalike notebook: https://github.com/GoogleCloudPlatform/analytics-componentized-patterns/blob/master/retail/ltv/bqml/notebooks/bqml_automl_ltv_activate_lookalike.ipynb on a Vertex AI user-managed notebook, but I got the error as written in the title.
I got the error when running the first cell of the Aggregate per day per customer chapter, i.e. the cell starting with this code:
%%bigquery --params $LTV_PARAMS --project $PROJECT_ID
DECLARE MAX_STDV_MONETARY INT64 DEFAULT @MAX_STDV_MONETARY;
DECLARE MAX_STDV_QTY INT64 DEFAULT @MAX_STDV_QTY;
CREATE OR REPLACE TABLE `ltv_ecommerce.20_aggred` AS
SELECT
customer_id,
order_day,
ROUND(day_value_after_returns, 2) AS value,
day_qty_after_returns as qty_articles,
day_num_returns AS num_returns,
CEIL(avg_time_to_return) AS time_to_return
FROM (
.....
Does anyone know how to fix this?
Here is the full error message I got:
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/magics/magics.py", line 515, in _cell_magic
params_option_value, rest_of_args = _split_args_line(line)
File "/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/magics/magics.py", line 720, in _split_args_line
tree = scanner.input_line()
File "/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/magics/line_arg_parser/parser.py", line 203, in input_line
options = self.option_list()
File "/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/magics/line_arg_parser/parser.py", line 264, in option_list
option = self.params_option()
File "/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/magics/line_arg_parser/parser.py", line 319, in params_option
opt_value = self.py_dict()
File "/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/magics/line_arg_parser/parser.py", line 333, in py_dict
dict_items = self.dict_items()
File "/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/magics/line_arg_parser/parser.py", line 347, in dict_items
item = self.dict_item()
File "/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/magics/line_arg_parser/parser.py", line 371, in dict_item
value = self.py_value()
File "/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/magics/line_arg_parser/parser.py", line 423, in py_value
self.error(msg, exc_type=QueryParamsParseError)
File "/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/magics/line_arg_parser/parser.py", line 193, in error
raise exc_type(message)
QueryParamsParseError: Unexpected token type UNKNOWN at position 27.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3457, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "/tmp/ipykernel_17444/3535280929.py", line 1, in <module>
get_ipython().run_cell_magic('bigquery', '--params $LTV_PARAMS --project $PROJECT_ID', '\nDECLARE MAX_STDV_MONETARY INT64 DEFAULT @MAX_STDV_MONETARY;\nDECLARE MAX_STDV_QTY INT64 DEFAULT @MAX_STDV_QTY;\n\nCREATE OR REPLACE TABLE `ltv_ecommerce.20_aggred` AS\nSELECT\n customer_id,\n order_day,\n ROUND(day_value_after_returns, 2) AS value,\n day_qty_after_returns as qty_articles,\n day_num_returns AS num_returns,\n CEIL(avg_time_to_return) AS time_to_return\nFROM (\n SELECT\n customer_id,\n order_day,\n SUM(order_value_after_returns) AS day_value_after_returns,\n STDDEV(SUM(order_value_after_returns)) OVER(PARTITION BY customer_id ORDER BY SUM(order_value_after_returns)) AS stdv_value,\n SUM(order_qty_after_returns) AS day_qty_after_returns,\n STDDEV(SUM(order_qty_after_returns)) OVER(PARTITION BY customer_id ORDER BY SUM(order_qty_after_returns)) AS stdv_qty,\n CASE\n WHEN MIN(order_min_qty) < 0 THEN count(1)\n ELSE 0\n END AS day_num_returns,\n CASE\n WHEN MIN(order_min_qty) < 0 THEN AVG(time_to_return)\n ELSE NULL\n END AS avg_time_to_return\n FROM (\n SELECT \n customer_id,\n order_id,\n -- Gives the order date vs return(s) dates.\n MIN(transaction_date) AS order_day,\n MAX(transaction_date) AS return_final_day,\n DATE_DIFF(MAX(transaction_date), MIN(transaction_date), DAY) AS time_to_return,\n -- Aggregates all products in the order \n -- and all products returned later.\n SUM(qty * unit_price) AS order_value_after_returns,\n SUM(qty) AS order_qty_after_returns,\n -- If negative, order has qty return(s).\n MIN(qty) order_min_qty\n FROM \n `ltv_ecommerce.10_orders`\n GROUP BY\n customer_id,\n order_id)\n GROUP BY\n customer_id,\n order_day)\nWHERE\n -- [Optional] Remove dates with outliers per a customer.\n (stdv_value < MAX_STDV_MONETARY\n OR stdv_value IS NULL) AND\n (stdv_qty < MAX_STDV_QTY\n OR stdv_qty IS NULL);\n\n\nSELECT * FROM `ltv_ecommerce.20_aggred` LIMIT 5;\n')
File "/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 2419, in run_cell_magic
result = fn(*args, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/magics/magics.py", line 521, in _cell_magic
raise rebranded_error from exc
File "<string>", line unknown
SyntaxError: --params is not a correctly formatted JSON string or a JSON serializable dictionary
Hi rain-ml, It looks like you may not have run one of the prior cells that creates the LTV_PARAMS dictionary:
LTV_PARAMS = {
'WINDOW_LENGTH': 0,
'WINDOW_STEP': 30,
'WINDOW_STEP_INITIAL': 90,
'LENGTH_FUTURE': 30,
'MAX_STDV_MONETARY': 500,
'MAX_STDV_QTY': 100,
'TOP_LTV_RATIO': 0.2
}
LTV_PARAMS
In other words, in the first line of your code, %%bigquery is expecting to import a dictionary which passes in variables that can then be used in the SQL code below it. This is why your output says:
--params is not a correctly formatted JSON string or a JSON serializable dictionary
Once you create the LTV_PARAMS dictionary and use it as part of %%bigquery, it should then allow you to reference the two variables @MAX_STDV_MONETARY @MAX_STDV_QTY which were set in LTV_PARAMS above:
DECLARE MAX_STDV_MONETARY INT64 DEFAULT @MAX_STDV_MONETARY;
DECLARE MAX_STDV_QTY INT64 DEFAULT @MAX_STDV_QTY;
If you don't like passing in LTV_PARAMS
, you can also remove it and replace it with values directly, such as:
%%bigquery --project $PROJECT_ID
DECLARE MAX_STDV_MONETARY INT64 DEFAULT 500;
DECLARE MAX_STDV_QTY INT64 DEFAULT 100;
CREATE OR REPLACE TABLE `ltv_ecommerce.20_aggred` AS
SELECT
customer_id,
order_day,
ROUND(day_value_after_returns, 2) AS value,
day_qty_after_returns as qty_articles,
day_num_returns AS num_returns,
CEIL(avg_time_to_return) AS time_to_return
FROM (
.....
Hope this helps
Hi polong-lin,
I did run the LTV_PARAMS cell before referencing it and that's how I got the original error.
In fact, If I didn't run the LTV_PARAMS cell, here's the error message:
NameError: Parameter expansion failed, undefined variable "LTV_PARAMS".
Anyways, I just hard code it for now like you suggested without using params. Thanks!