
SNOW-1797580: Integer columns contain NaN after filtering when using to_pandas in local testing

Open frederiksteiner opened this issue 1 year ago • 4 comments

  1. What version of Python are you using?

    Python 3.11.8

  2. What operating system and processor architecture are you using?

    Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.35

  3. What are the component versions in the environment (pip freeze)?

    snowflake-connector-python==3.12.3 snowflake-snowpark-python==1.24.0

  4. What did you do?

if __name__ == "__main__":
    from snowflake.snowpark import Session
    import snowflake.snowpark.functions as spf
    conn_params = {
        "schema": "SCHEMA",
        "local_testing": True,
    }

    session = Session.builder.configs(conn_params).create()
    data = [
        [1, False],
        [1, False],
        [1, False],
        [2, True],
    ]
    schema = ["INT_COL", "BOOL_COL"]
    df = session.create_dataframe(data, schema)
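    # Cast INT_COL to an integer type, then keep only the rows where BOOL_COL is True.
    df = df.with_column("INT_COL", spf.cast("INT_COL", "int"))
    filtered = df.filter(spf.col("BOOL_COL"))
    # to_pandas() yields NaN in INT_COL, while collect() returns the expected value.
    pd_df = filtered.to_pandas()
    collected = filtered.collect()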
  5. What did you expect to see?

    I expected pd_df to contain the same data as collected, but the int column is NaN in the pandas DataFrame. I have already found the cause and will open a PR asap.
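    For anyone who wants to check the mismatch locally, here is a minimal sketch of the comparison. The helper names `columns_with_nan` and `same_rows` are hypothetical and not part of Snowpark, and the sample records are typed in by hand to mirror the single row that survives the filter. With the repro above, the inputs could presumably be built from `[r.as_dict() for r in collected]` and `pd_df.to_dict("records")`.

```python
import math


def columns_with_nan(records):
    """Return the sorted names of columns that hold at least one float NaN."""
    bad = set()
    for row in records:
        for name, value in row.items():
            if isinstance(value, float) and math.isnan(value):
                bad.add(name)
    return sorted(bad)


def same_rows(expected, actual):
    """Return True when both record lists hold equal rows in the same order.

    NaN never compares equal to anything, so a NaN cell makes the rows differ.
    """
    return len(expected) == len(actual) and all(
        e == a for e, a in zip(expected, actual)
    )


# Hand-written stand-ins (assumptions) for the two results in the repro:
# collect() keeps the integer, to_pandas() reports NaN in INT_COL.
collected_like = [{"INT_COL": 2, "BOOL_COL": True}]
pandas_like = [{"INT_COL": float("nan"), "BOOL_COL": True}]

nan_columns = columns_with_nan(pandas_like)
results_agree = same_rows(collected_like, pandas_like)
```

    Once the bug is fixed, `nan_columns` should be empty and `results_agree` should be True for the real data.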

frederiksteiner, Nov 11 '24 07:11