SNOW-902662: DF.to_pandas_batches() batch size parameter

Open ghost opened this issue 2 years ago • 2 comments

Current behaviour

DataFrame.to_pandas_batches() returns an iterator of pandas DataFrames, but the number of rows in each DataFrame is unpredictable and appears "random" from the caller's side.

Desired behaviour

I would like to_pandas_batches() to accept a parameter that fixes the number of rows in each pandas DataFrame it yields.

How would this improve snowflake-snowpark-python?

This would let users control the size of the chunks they process, so that their processes are not overloaded by whatever row count the Snowflake back end chooses for each batch.
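
Until such a parameter exists, the desired behaviour can be approximated on the client side. The following is a minimal sketch, not part of snowflake-snowpark-python: `rebatch` and `batch_size` are hypothetical names introduced here for illustration. It assumes only that the input is an iterable of pandas DataFrames, which is what to_pandas_batches() is described as returning above, and regroups them into chunks of a fixed row count (only the last chunk may be shorter).

```python
from typing import Iterable, Iterator

import pandas as pd


def rebatch(frames: Iterable[pd.DataFrame], batch_size: int) -> Iterator[pd.DataFrame]:
    """Regroup DataFrames of arbitrary length into chunks of batch_size rows.

    Hypothetical helper: every yielded chunk has exactly batch_size rows,
    except possibly the last one, which holds the remainder.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")

    buffer = []    # slices collected for the chunk being assembled
    buffered = 0   # total number of rows currently held in buffer

    for frame in frames:
        start = 0
        while start < len(frame):
            # Take as many rows as are needed to fill the chunk,
            # or as many as are left in this frame, whichever is smaller.
            take = min(batch_size - buffered, len(frame) - start)
            buffer.append(frame.iloc[start:start + take])
            buffered += take
            start += take
            if buffered == batch_size:
                yield pd.concat(buffer, ignore_index=True)
                buffer, buffered = [], 0

    # Emit the leftover rows, if any, as a final shorter chunk.
    if buffer:
        yield pd.concat(buffer, ignore_index=True)
```

Assuming `df` is a Snowpark DataFrame, it could be used as `for chunk in rebatch(df.to_pandas_batches(), 10_000): ...`. This only reshapes batches after they arrive; it does not change how many rows the back end sends per batch, so memory use is bounded by one incoming batch plus one outgoing chunk.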

ghost • Aug 29 '23 12:08

Hi @MarcoFreitas0, I would like to take a look at this issue!

RahulDubey391 • Nov 27 '23 07:11

I am also interested in this feature.

stong1108 • Mar 07 '24 21:03