K-Anonymity

Issue in cell 776: missing group-by of feature columns

Open balalavanya opened this issue 4 years ago • 6 comments

    grouped_columns = df.loc[partition].agg(aggregations, squeeze=False)

balalavanya, Sep 01 '20 10:09

This does not work for categorical columns; it fails with the error "unhashable type: 'list'". I tried it with feature_columns = ['age', 'occupation'].

balalavanya, Sep 01 '20 11:09

Hey Nithin, when I run this command:

dfn = build_anonymized_dataset(df, finished_partitions, feature_columns, sensitive_column)

it shows me the error below. Can you help me, please? Thank you.

    TypeError                                 Traceback (most recent call last)
    D:\anaconda\lib\site-packages\pandas\core\series.py in aggregate(self, func, axis, *args, **kwargs)
       3960         try:
    -> 3961             result = self.apply(func, *args, **kwargs)
       3962         except (ValueError, AttributeError, TypeError):

    D:\anaconda\lib\site-packages\pandas\core\series.py in apply(self, func, convert_dtype, args, **kwds)
       4107                 values = self.astype(object)._values
    -> 4108                 mapped = lib.map_infer(values, f, convert=convert_dtype)
       4109

    pandas\_libs\lib.pyx in pandas._libs.lib.map_infer()

    in agg_categorical_column(series)
          1 def agg_categorical_column(series):
    ----> 2     return [','.join(set(series))]
          3

    TypeError: 'int' object is not iterable

    During handling of the above exception, another exception occurred:

    TypeError                                 Traceback (most recent call last)
    D:\anaconda\lib\site-packages\pandas\core\frame.py in aggregate(self, func, axis, *args, **kwargs)
       7574         try:
    -> 7575             result, how = self._aggregate(func, axis, *args, **kwargs)
       7576         except TypeError as err:

    D:\anaconda\lib\site-packages\pandas\core\frame.py in _aggregate(self, arg, axis, *args, **kwargs)
       7605             return result, how
    -> 7606         return aggregate(self, arg, *args, **kwargs)
       7607

    D:\anaconda\lib\site-packages\pandas\core\aggregation.py in aggregate(obj, arg, *args, **kwargs)
        565         arg = cast(AggFuncTypeDict, arg)
    --> 566         return agg_dict_like(obj, arg, _axis), True
        567     elif is_list_like(arg):

    D:\anaconda\lib\site-packages\pandas\core\aggregation.py in agg_dict_like(obj, arg, _axis)
        751         # key used for column selection and output
    --> 752         results = {key: obj._gotitem(key, ndim=1).agg(how) for key, how in arg.items()}
        753

    D:\anaconda\lib\site-packages\pandas\core\aggregation.py in (.0)
        751         # key used for column selection and output
    --> 752         results = {key: obj._gotitem(key, ndim=1).agg(how) for key, how in arg.items()}
        753

    D:\anaconda\lib\site-packages\pandas\core\series.py in aggregate(self, func, axis, *args, **kwargs)
       3962         except (ValueError, AttributeError, TypeError):
    -> 3963             result = func(self, *args, **kwargs)
       3964

    in agg_categorical_column(series)
          1 def agg_categorical_column(series):
    ----> 2     return [','.join(set(series))]
          3

    TypeError: sequence item 0: expected str instance, int found

    The above exception was the direct cause of the following exception:

    TypeError                                 Traceback (most recent call last)
    in
    ----> 1 dfn = build_anonymized_dataset(df, finished_partitions, feature_columns, sensitive_column)

    in build_anonymized_dataset(df, partitions, feature_columns, sensitive_column, max_partitions)
         12         if max_partitions is not None and i > max_partitions:
         13             break
    ---> 14         grouped_columns = df.loc[partition].agg(aggregations, squeeze=False)
         15         sensitive_counts = df.loc[partition].groupby(sensitive_column).agg({sensitive_column : 'count'})
         16         values = grouped_columns.iloc[0].to_dict()

    D:\anaconda\lib\site-packages\pandas\core\frame.py in aggregate(self, func, axis, *args, **kwargs)
       7579                 f"incompatible data and dtype: {err}"
       7580             )
    -> 7581             raise exc from err
       7582         if result is None:
       7583             return self.apply(func, axis=axis, args=args, **kwargs)

    TypeError: DataFrame constructor called with incompatible data and dtype: sequence item 0: expected str instance, int found
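
The last message in the traceback, "expected str instance, int found", comes from `str.join`, which only accepts strings, so a column holding integers cannot be joined directly. Below is a minimal sketch of one possible fix, assuming it is acceptable to render the values as text. The function name `agg_categorical_column_as_str`, the `astype(str)` cast, and the `sorted()` call are illustrative additions, not part of the original notebook.

```python
import pandas as pd


def agg_categorical_column_as_str(series: pd.Series) -> list:
    """Hypothetical variant of agg_categorical_column that tolerates non-string values."""
    # Cast every value to str first, because str.join rejects integers.
    # sorted() only makes the output order deterministic.
    unique_values = sorted(set(series.astype(str)))
    return [",".join(unique_values)]


# Example: an integer column that the original function could not join.
ages = pd.Series([25, 31, 25])
joined = agg_categorical_column_as_str(ages)
print(joined)
```

The function is called directly on a Series here; how `DataFrame.agg` dispatches to it can differ between pandas versions.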

pablomoha, Mar 10 '21 17:03

AttributeError: 'list' object has no attribute 'to_dict'. How can I solve this error, please?

Arigato97, May 12 '21 05:05

> AttributeError: 'list' object has no attribute 'to_dict'. How can I solve this error, please?

I got the same error. Here is a workaround:

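def agg_categorical_column(series):
    # Workaround: the result of astype() is discarded on purpose. When pandas
    # passes a single value instead of a Series, this call raises
    # AttributeError, which appears to make pandas retry with the whole Series.
    series.astype('category')
    return [','.join(set(series))]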
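
One plausible reading of why this workaround helps, based on the `try`/`except` visible in the `series.py` frames of the traceback above: the function is first applied to each element, and only on an error is it called with the whole Series. The sketch below imitates that fallback with a hypothetical helper, `aggregate_like_traceback`. It is not pandas code, and `sorted()` is added only to make the output order deterministic.

```python
import pandas as pd


def aggregate_like_traceback(series, func):
    # Hypothetical stand-in for the try/except shown in the traceback:
    # try func on each element, fall back to the whole Series on error.
    try:
        return [func(value) for value in series]
    except (ValueError, AttributeError, TypeError):
        return func(series)


def unguarded(series):
    # Works on a single string too, so the element-wise path "succeeds"
    # and joins the characters of each value instead of the column values.
    return [",".join(sorted(set(series)))]


def guarded(series):
    # A plain str has no astype(), so the element-wise path fails here
    # and the fallback receives the whole Series.
    series.astype("category")
    return [",".join(sorted(set(series)))]


occupation = pd.Series(["Sales", "Tech-support", "Sales"])
per_element = aggregate_like_traceback(occupation, unguarded)
whole_column = aggregate_like_traceback(occupation, guarded)
print(per_element)
print(whole_column)
```

In the unguarded case every element yields its own list, which would be consistent with the "'list' object has no attribute 'to_dict'" error reported above, although the thread does not confirm that this is the cause.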

glassonion1, Oct 20 '21 10:10

> AttributeError: 'list' object has no attribute 'to_dict'. How can I solve this error, please?

I got the same error too. It turned out that df.agg() returned a Series instead of a DataFrame, so I converted it. Here is the workaround:

grouped_columns = df.loc[partition].agg(aggregations, squeeze=False)
sensitive_counts = df.loc[partition].groupby(sensitive_column).agg({sensitive_column: 'count'})
# insert: turn the Series returned by agg() into a one-row DataFrame
df2 = grouped_columns.to_frame()
grouped_columns = pd.DataFrame(df2.values.T, columns=df2.index)
# insert_end
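
The conversion above can be tried in isolation. This is a small sketch with made-up values standing in for the result of `df.agg()`. It uses `to_frame().T`, which should give the same one-row layout as the `pd.DataFrame(df2.values.T, columns=df2.index)` call in the workaround.

```python
import pandas as pd

# Made-up aggregation result: one value per feature column, returned as a Series.
grouped_columns = pd.Series({"age": 30.5, "occupation": "Sales,Tech-support"})

# to_frame() gives a single column; transposing turns it into a single row
# whose column labels are the former index labels.
one_row = grouped_columns.to_frame().T

# The line that failed in build_anonymized_dataset now has a row to read.
values = one_row.iloc[0].to_dict()
print(values)
```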

XDUqinian, May 30 '22 15:05

I was able to make it work by changing lines 79 and 108 of the file anonypy.py.

Replacing this line

values = grouped_columns.iloc[0].to_dict()

with this one

values = {}
for name, val in grouped_columns.items():
    values[name] = val[0]

Hope it helps.
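
Since the thread reports `agg()` returning different shapes in different environments, a defensive variant of this replacement could accept both. The following is only a sketch: the helper name `first_values` and the sample data are hypothetical and do not come from anonypy.py.

```python
import pandas as pd


def first_values(grouped_columns) -> dict:
    """Hypothetical helper: build the values dict from a DataFrame or a Series."""
    if isinstance(grouped_columns, pd.DataFrame):
        # Original code path: take the first row.
        return grouped_columns.iloc[0].to_dict()
    values = {}
    for name, val in grouped_columns.items():
        # Categorical aggregates arrive wrapped in a one-element list;
        # numeric aggregates are plain scalars.
        values[name] = val[0] if isinstance(val, list) else val
    return values


as_frame = pd.DataFrame({"age": [30.5], "occupation": ["Sales,Tech-support"]})
as_series = pd.Series(
    [30.5, ["Sales,Tech-support"]], index=["age", "occupation"], dtype=object
)
print(first_values(as_frame))
print(first_values(as_series))
```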

IamWorld, Mar 01 '24 05:03