pandas_flavor
How can I distinguish between inplace and copy operations?
Problem description: It's difficult to tell when a registered method modifies the dataframe in place and when it returns a copy, and the behavior seems inconsistent. Does pandas_flavor always require either copying or modifying the original dataframe? Is there a way to avoid the copy in order to save memory?
import pandas_flavor
# Does not replace original dataframe
@pandas_flavor.register_dataframe_method
def drop_empty_rows(dataframe):
    return dataframe.dropna(axis=0, how='all')
# Should replace original dataframe
@pandas_flavor.register_dataframe_method
def drop_empty_rows(dataframe):
    dataframe = dataframe.dropna(axis=0, how='all')
    return dataframe
# Should replace original dataframe
@pandas_flavor.register_dataframe_method
def drop_empty_rows(dataframe):
    dataframe_processed = dataframe.copy()
    dataframe_processed = dataframe.dropna(axis=0, how='all')
    return dataframe_processed
If I call the first function on a dataframe, it returns the dataframe with dropped rows but does not change the original dataframe.
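import numpy
import pandas

# Row 1 is entirely NaN, so it is the only row that should be dropped
dict_rows = {}
dict_rows['A'] = [20, numpy.nan, 40, 10, 50]
dict_rows['B'] = [50, numpy.nan, 10, 40, 50]
dict_rows['C'] = [30, numpy.nan, 50, 40, 50]
dataframe = pandas.DataFrame(dict_rows)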
This function returns the reduced dataframe, but doesn't affect the original dataframe.
>>> dataframe.drop_empty_rows()
      A     B     C
0  20.0  50.0  30.0
2  40.0  10.0  50.0
3  10.0  40.0  40.0
4  50.0  50.0  50.0
@DOH-Manada pandas-flavor doesn't copy. It just provides a convenient interface; the real work is done in pandas. You can also enable pandas Copy-on-Write, which should take care of the copy-or-not-copy decisions for better memory performance.
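To make the difference concrete, here is a minimal sketch using plain pandas functions. The decorator is left off so the example depends only on pandas; it assumes a registered method receives the same DataFrame object it was called on, which is what the reply above suggests. The function names and the `make_frame` helper are made up for illustration.

```python
import numpy as np
import pandas as pd


def drop_empty_rows_copy(dataframe):
    # dropna() returns a new DataFrame; the caller's object is untouched.
    return dataframe.dropna(axis=0, how="all")


def drop_empty_rows_rebind(dataframe):
    # Rebinding the local name only changes what `dataframe` refers to
    # inside this function; the caller still holds the original object.
    dataframe = dataframe.dropna(axis=0, how="all")
    return dataframe


def drop_empty_rows_inplace(dataframe):
    # inplace=True mutates the object that was passed in.
    dataframe.dropna(axis=0, how="all", inplace=True)
    return dataframe


def make_frame():
    # Hypothetical helper: same data as in the question (row 1 is all NaN).
    return pd.DataFrame({
        "A": [20, np.nan, 40, 10, 50],
        "B": [50, np.nan, 10, 40, 50],
        "C": [30, np.nan, 50, 40, 50],
    })


for func in (drop_empty_rows_copy, drop_empty_rows_rebind, drop_empty_rows_inplace):
    original = make_frame()
    result = func(original)
    # Report the row count of the caller's frame and of the returned frame.
    print(func.__name__, len(original), len(result), result is original)
```

Only the `inplace=True` variant shrinks the caller's frame to 4 rows; the other two leave it at 5 and return a new 4-row frame. If the goal is just to stop holding on to the old data, reassigning at the call site (`dataframe = dataframe.drop_empty_rows()`) releases the original once nothing else references it. Note that `inplace=True` does not necessarily save memory, since pandas may still build the new data internally. In pandas 2.x, Copy-on-Write can be switched on with `pd.options.mode.copy_on_write = True`.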