
SNOW-823678: Creating Python sprocs inside of a Python sproc throws an error

Open teej opened this issue 1 year ago • 0 comments

  1. What version of Python are you using?
Python 3.8.16 (default, May 18 2023, 16:59:27) 
[Clang 14.0.0 (clang-1400.0.29.202)]
  1. What operating system and processor architecture are you using?
macOS-13.2.1-x86_64-i386-64bit
  1. What are the component versions in the environment (pip freeze)?
snowflake-snowpark-python==1.4.0
  1. What did you do?

Consider a Python stored procedure (outer_sproc) that creates a new stored procedure (inner_sproc).

create or replace procedure outer_sproc()
  returns VARIANT
  language python
  runtime_version = '3.8'
  packages = ('snowflake-snowpark-python')
  handler = 'main'
  EXECUTE AS CALLER
AS
$$
import snowflake.snowpark

def inner_sproc(session: snowflake.snowpark.session.Session) -> int:
  return 1

def main(session: snowflake.snowpark.session.Session) -> dict:
  session.sql("CREATE STAGE IF NOT EXISTS sprocs").collect()
  session.add_packages("snowflake-snowpark-python")
  # Uncomment this line to fix the issue
  # inner_sproc.__module__ = '__main__'
  _inner_sproc = session.sproc.register(
      inner_sproc,
      name="inner_sproc",
      replace=True,
      is_permanent=True,
      stage_location="@sprocs",
  )
  return {"result": _inner_sproc()}
$$;

When called, outer_sproc throws an error.

CALL outer_sproc();

Traceback (most recent call last):
...
    raise error_class(
snowflake.snowpark.exceptions.SnowparkSQLException: (1304): 01ac66a5-0001-1e54-0001-ed960006c22e: 100357 (P0000): Python Interpreter Error:
Traceback (most recent call last):
  File "_udf_code.py", line 4, in <module>
ModuleNotFoundError: No module named 'main_module'
 in function INNER_SPROC with handler compute
 in function OUTER_SPROC with handler main

The root cause lies in how snowpark-python pickles functions in the context of a stored procedure.

snowpark-python reads func.__module__ (see https://github.com/snowflakedb/snowpark-python/blob/main/src/snowflake/snowpark/_internal/code_generation.py#L513) to determine which module the in-memory function lives in. Inside a stored procedure, however, that function lives in the special cloudpickle module main_module. That module name is incorrectly carried into the code generated for inner_sproc, which then fails at import time because no such module exists there.
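The failure mode can be reproduced without Snowflake. The sketch below is only an analogy using the standard library's pickle, not snowpark-python's actual code generation: it defines a function inside a stand-in module named main_module (a hypothetical substitute for the sproc runtime's module), serializes it by reference, and then tries to load it where that module is unavailable.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the module the sproc runtime loads handler code into.
fake = types.ModuleType("main_module")
exec("def inner_sproc(session=None):\n    return 1\n", fake.__dict__)
sys.modules["main_module"] = fake

func = fake.inner_sproc
recorded_module = func.__module__  # the module name any serializer will see

# Standard pickle stores functions by reference: module name + function name only.
payload = pickle.dumps(func)

# Simulate the environment in which inner_sproc later runs: the module is gone.
del sys.modules["main_module"]
try:
    pickle.loads(payload)
    error_message = None
except ModuleNotFoundError as exc:
    error_message = str(exc)

print(recorded_module)
print(error_message)
```

The recorded module name is the only link back to the function body, so anything that replays that name in a different interpreter depends on the module being importable there.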

  1. What did you expect to see?

I expected outer_sproc to register inner_sproc and return its result. Manually setting the Python function's __module__ to __main__ before registering it works around the issue.

CALL outer_sproc(); // => {"result": 1}
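If permanently mutating the function is undesirable, the workaround could be scoped to the registration call. The helper below is a hypothetical sketch (as_main_module is not part of snowpark-python) that sets __module__ to __main__ for the duration of a with block and then restores the original value. The main_module assignment only mimics the value observed inside a stored procedure.

```python
from contextlib import contextmanager


@contextmanager
def as_main_module(func):
    """Temporarily present func as if it were defined in __main__ (hypothetical helper)."""
    original = func.__module__
    func.__module__ = "__main__"
    try:
        yield func
    finally:
        # Restore the original module name even if registration fails.
        func.__module__ = original


def inner_sproc(session=None) -> int:
    return 1


# Mimic the module name seen inside a stored procedure (assumption for the demo).
inner_sproc.__module__ = "main_module"

with as_main_module(inner_sproc) as patched:
    module_during = patched.__module__
    # A call such as session.sproc.register(patched, ...) would go here.

module_after = inner_sproc.__module__
```

Whether restoring the attribute afterwards is safe depends on when snowpark-python reads it. This sketch assumes the read happens during the register call.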
  1. Can you set logging to DEBUG and collect the logs?

Full Traceback

Traceback (most recent call last):
  File "_udf_code.py", line 11, in main
  File "/usr/lib/python_udf/a35acd1b5955d1aa8bb276bee1e3596a7199ba3d16f4897b8ee20b520fbe9d4e/lib/python3.8/site-packages/snowflake/snowpark/stored_procedure.py", line 472, in register
    return self._do_register_sp(
  File "/usr/lib/python_udf/a35acd1b5955d1aa8bb276bee1e3596a7199ba3d16f4897b8ee20b520fbe9d4e/lib/python3.8/site-packages/snowflake/snowpark/stored_procedure.py", line 722, in _do_register_sp
    raise ne.with_traceback(tb) from None
  File "/usr/lib/python_udf/a35acd1b5955d1aa8bb276bee1e3596a7199ba3d16f4897b8ee20b520fbe9d4e/lib/python3.8/site-packages/snowflake/snowpark/stored_procedure.py", line 695, in _do_register_sp
    create_python_udf_or_sp(
  File "/usr/lib/python_udf/a35acd1b5955d1aa8bb276bee1e3596a7199ba3d16f4897b8ee20b520fbe9d4e/lib/python3.8/site-packages/snowflake/snowpark/_internal/udf_utils.py", line 712, in create_python_udf_or_sp
    session._run_query(create_query, is_ddl_on_temp_object=is_temporary)
  File "/usr/lib/python_udf/a35acd1b5955d1aa8bb276bee1e3596a7199ba3d16f4897b8ee20b520fbe9d4e/lib/python3.8/site-packages/snowflake/snowpark/session.py", line 1231, in _run_query
    return self._conn.run_query(
  File "/usr/lib/python_udf/a35acd1b5955d1aa8bb276bee1e3596a7199ba3d16f4897b8ee20b520fbe9d4e/lib/python3.8/site-packages/snowflake/snowpark/_internal/server_connection.py", line 102, in wrap
    raise ex
  File "/usr/lib/python_udf/a35acd1b5955d1aa8bb276bee1e3596a7199ba3d16f4897b8ee20b520fbe9d4e/lib/python3.8/site-packages/snowflake/snowpark/_internal/server_connection.py", line 96, in wrap
    return func(*args, **kwargs)
  File "/usr/lib/python_udf/a35acd1b5955d1aa8bb276bee1e3596a7199ba3d16f4897b8ee20b520fbe9d4e/lib/python3.8/site-packages/snowflake/snowpark/_internal/server_connection.py", line 365, in run_query
    raise ex
  File "/usr/lib/python_udf/a35acd1b5955d1aa8bb276bee1e3596a7199ba3d16f4897b8ee20b520fbe9d4e/lib/python3.8/site-packages/snowflake/snowpark/_internal/server_connection.py", line 346, in run_query
    results_cursor = self._cursor.execute(query, params=params, **kwargs)
  File "/usr/lib/python_udf/a35acd1b5955d1aa8bb276bee1e3596a7199ba3d16f4897b8ee20b520fbe9d4e/lib/python3.8/site-packages/snowflake/connector/cursor.py", line 829, in execute
    Error.errorhandler_wrapper(
  File "/usr/lib/python_udf/a35acd1b5955d1aa8bb276bee1e3596a7199ba3d16f4897b8ee20b520fbe9d4e/lib/python3.8/site-packages/snowflake/connector/errors.py", line 232, in errorhandler_wrapper
    handed_over = Error.hand_to_other_handler(
  File "/usr/lib/python_udf/a35acd1b5955d1aa8bb276bee1e3596a7199ba3d16f4897b8ee20b520fbe9d4e/lib/python3.8/site-packages/snowflake/connector/errors.py", line 287, in hand_to_other_handler
    cursor.errorhandler(connection, cursor, error_class, error_value)
  File "/usr/lib/python_udf/a35acd1b5955d1aa8bb276bee1e3596a7199ba3d16f4897b8ee20b520fbe9d4e/lib/python3.8/site-packages/snowflake/connector/errors.py", line 165, in default_errorhandler
    raise error_class(
snowflake.snowpark.exceptions.SnowparkSQLException: (1304): 01ac66a5-0001-1e54-0001-ed960006c22e: 100357 (P0000): Python Interpreter Error:
Traceback (most recent call last):
  File "_udf_code.py", line 4, in <module>
ModuleNotFoundError: No module named 'main_module'
 in function INNER_SPROC with handler compute
 in function OUTER_SPROC with handler main

teej, May 20 '23 00:05