snowpark-python
SNOW-1075566: Patching function with no argument
Hi
I was following the instructions on how to patch built-in functions (https://docs.snowflake.com/en/developer-guide/snowpark/python/testing-locally#patching-built-in-functions), but I am not sure how to do that for the current_date() function.
This is how I have approached that:
@patch(current_date)
def patch_current_date() -> ColumnEmulator:
    ret_column = ColumnEmulator(data=[datetime.date.today()])
    ret_column.sf_type = ColumnType(DateType(), True)
    return ret_column
But this only fills the first row of the DataFrame; the remaining rows in that column are NA.
This is what that line looks like in my test function:
input_df.with_column('CURRENT_DATE', current_date())
Hi @petsvakala, thanks for reaching out. @sfc-gh-jrose, I know you added support for current_date recently. Can you take a look at this issue and see whether it is covered?
I did add current_date recently, but I don't think it has made it into a release yet. I believe the issue in this bug is this line:
ret_column = ColumnEmulator(data=[datetime.date.today()])
The column emulator assumes that the data is the same length as the column and inserts None if no data remains in the list. If you remove the list brackets, it will instead be a single value that is used for all entries in the column.
ret_column = ColumnEmulator(data=datetime.date.today())
Thank you for the quick reply. I tried what you recommended (removed the brackets) but still face the same issue:
@patch(current_date)
def patch_current_date() -> ColumnEmulator:
    ret_column = ColumnEmulator(data=datetime.date.today())
    ret_column.sf_type = ColumnType(DateType(), True)
    return ret_column
I was wrong. This appears to be a gap in the local testing API. I'll see if support can be added by the next release.
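One possible explanation for why removing the brackets made no difference, assuming ColumnEmulator behaves like a plain pandas Series here (an assumption about the local-testing internals that this thread does not confirm): in pandas, a Series built from a bare scalar without an index has length 1, exactly like one built from a one-element list. The sketch below uses only pandas, not Snowpark, and the frame and variable names are illustrative.

```python
import datetime

import pandas as pd

today = datetime.date.today()

# A three-row frame standing in for input_df from the report.
frame = pd.DataFrame({"ID": ["A1", "A1", "A2"], "ORDER_TOTAL": [50.0, 50.0, 80.0]})

# With no index given, both constructions produce a Series of length 1.
from_list = pd.Series(data=[today])
from_scalar = pd.Series(data=today)

# Column assignment aligns on the index, so only row 0 finds a match
# and rows 1 and 2 are left as missing values.
frame["CURRENT_DATE"] = from_scalar

# A scalar is only repeated when an index of the target length is supplied.
broadcast = pd.Series(data=today, index=frame.index)
```

If that assumption holds, a patched function has no way to fill every row unless it knows the row count, which would be consistent with the gap described above.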
OK. Which close status should I choose for this issue for the time being, or should I keep it open?
@petsvakala -- Can you retry with v1.14.0? It should be fixed now
Hi, unfortunately the behaviour is still the same.
Here you can see that the new package is installed (output of poetry show snowflake-snowpark-python):
name : snowflake-snowpark-python
version : 1.14.0
description : Snowflake Snowpark for Python
dependencies
- cloudpickle >=1.6.0,<2.1.0 || >2.1.0,<2.2.0 || >2.2.0,<=2.2.1
- cloudpickle 2.2.1
- pyyaml *
- setuptools >=40.6.0
- snowflake-connector-python >=3.6.0,<4.0.0
- typing-extensions >=4.1.0,<5.0.0
Here is a minimal reproducible example:
import pytest
import pandas as pd
import snowflake.snowpark.session as ses
from snowflake.snowpark.functions import current_date

@pytest.mark.data_processing
def test_calculate_rfm(request, session: ses.Session) -> None:
    if request.config.getoption('--snowflake-session') == 'local':
        # Importing the module registers the patch for local testing
        from tests.patches import patch_current_date
    ID = ["A1", "A1", "A2"]
    ORDER_TOTAL = [50.0, 50.0, 80.0]
    data = {'ID': ID, 'ORDER_TOTAL': ORDER_TOTAL}  # renamed from `dict` to avoid shadowing the builtin
    df = pd.DataFrame(data)
    input_df = session.create_dataframe(df)
    # This only assigns the current date to the first row
    snowpark_df = (input_df.with_column('CURRENT_DATE', current_date())).to_pandas()
    # Here I create a two-row DataFrame, but only a single row is returned
    snowpark_df2 = session.create_dataframe([[1, 'a', True], [3, 'b', False]]).select(current_date()).to_pandas()
    assert 1 == 1
This is how patching looks at the moment:
@patch(current_date)
def patch_current_date() -> ColumnEmulator:
    ret_column = ColumnEmulator(data=datetime.date.today())
    ret_column.sf_type = ColumnType(DateType(), True)
    return ret_column
Hey @petsvakala, we have added new features to our patching functions to pass the number of rows. We also now have built-in mocking support for the current_date function, so you no longer need to patch it yourself: https://github.com/snowflakedb/snowpark-python/blob/main/src/snowflake/snowpark/mock/_functions.py#L523-L528
Could you try upgrading to the latest version of Snowpark Python and see if it resolves the issue?
Hi @sfc-gh-aling, yes, it seems to be working now. Thank you again!