K-Anonymity
There is an issue in cell 776: the group-by on the feature columns is missing.
grouped_columns = df.loc[partition].agg(aggregations, squeeze=False)
This does not work for categorical columns; it fails with the error "unhashable type: 'list'". I tried it with feature_columns = ['age', 'occupation'].
Hey Nithin, when I run this command:
dfn = build_anonymized_dataset(df, finished_partitions, feature_columns, sensitive_column)
it shows me the error below. Can you help me, please? Thank you.
TypeError Traceback (most recent call last)
D:\anaconda\lib\site-packages\pandas\core\series.py in aggregate(self, func, axis, *args, **kwargs)
   3960 try:
-> 3961 result = self.apply(func, *args, **kwargs)
   3962 except (ValueError, AttributeError, TypeError):

D:\anaconda\lib\site-packages\pandas\core\series.py in apply(self, func, convert_dtype, args, **kwds)
   4107 values = self.astype(object)._values
-> 4108 mapped = lib.map_infer(values, f, convert=convert_dtype)
   4109

pandas\_libs\lib.pyx in pandas._libs.lib.map_infer()

TypeError: 'int' object is not iterable

During handling of the above exception, another exception occurred:

TypeError Traceback (most recent call last)
D:\anaconda\lib\site-packages\pandas\core\frame.py in aggregate(self, func, axis, *args, **kwargs)
   7574 try:
-> 7575 result, how = self._aggregate(func, axis, *args, **kwargs)
   7576 except TypeError as err:

D:\anaconda\lib\site-packages\pandas\core\frame.py in _aggregate(self, arg, axis, *args, **kwargs)
   7605 return result, how
-> 7606 return aggregate(self, arg, *args, **kwargs)
   7607

D:\anaconda\lib\site-packages\pandas\core\aggregation.py in aggregate(obj, arg, *args, **kwargs)
    565 arg = cast(AggFuncTypeDict, arg)
--> 566 return agg_dict_like(obj, arg, _axis), True
    567 elif is_list_like(arg):

D:\anaconda\lib\site-packages\pandas\core\aggregation.py in agg_dict_like(obj, arg, _axis)
    751 # key used for column selection and output
--> 752 results = {key: obj._gotitem(key, ndim=1).agg(how) for key, how in arg.items()}
    753

D:\anaconda\lib\site-packages\pandas\core\aggregation.py in

D:\anaconda\lib\site-packages\pandas\core\series.py in aggregate(self, func, axis, *args, **kwargs)
   3962 except (ValueError, AttributeError, TypeError):
-> 3963 result = func(self, *args, **kwargs)
   3964

TypeError: sequence item 0: expected str instance, int found

The above exception was the direct cause of the following exception:

TypeError Traceback (most recent call last)
D:\anaconda\lib\site-packages\pandas\core\frame.py in aggregate(self, func, axis, *args, **kwargs)
   7579 f"incompatible data and dtype: {err}"
   7580 )
-> 7581 raise exc from err
   7582 if result is None:
   7583 return self.apply(func, axis=axis, args=args, **kwargs)

TypeError: DataFrame constructor called with incompatible data and dtype: sequence item 0: expected str instance, int found
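The message "sequence item 0: expected str instance, int found" is what Python's str.join raises when it is handed non-string items, so one plausible reading of the traceback is that an integer column is being aggregated with a join-based categorical aggregator. A minimal sketch that reproduces only that message, using a made-up list of values (an assumption, not data from the notebook):

```python
# Hypothetical values standing in for a numeric column treated as categorical.
ages = [25, 31, 25]

try:
    # str.join accepts only strings, so joining a set of ints fails.
    ','.join(set(ages))
    message = None
except TypeError as err:
    message = str(err)

print(message)
```

If this is indeed the cause, either mark only true string columns as categorical or cast the values to str before joining.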
AttributeError: 'list' object has no attribute 'to_dict'. How can I fix this error, please?
I got the same error. Here is a workaround:
def agg_categorical_column(series):
    # workaround here (note: astype returns a new Series, so as written its result is discarded)
    series.astype('category')
    return [','.join(set(series))]
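A variant of the aggregator that casts every distinct value to str before joining may be more robust when a "categorical" column actually holds numbers. This is only a sketch: the sorted() call is an addition to make the output order deterministic, and the sample Series is made up.

```python
import pandas as pd

def agg_categorical_column(series):
    # Cast each distinct value to str so that ','.join also works for numeric data.
    # sorted() is an addition here; the original joins the set in arbitrary order.
    distinct = sorted(str(value) for value in set(series))
    return [','.join(distinct)]

# Hypothetical integer column that would break a plain ','.join(set(series)).
result = agg_categorical_column(pd.Series([25, 31, 25]))
print(result)
```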
I got the same error too. It turns out that df.agg() returned a Series instead of a DataFrame, so I converted it. Here is the workaround:
grouped_columns = df.loc[partition].agg(aggregations, squeeze=False)
sensitive_counts = df.loc[partition].groupby(sensitive_column).agg({sensitive_column : 'count'})
# insert
df2 = grouped_columns.to_frame()
grouped_columns = pd.DataFrame(df2.values.T, columns=df2.index)
# insert_end
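The same Series-to-DataFrame conversion can be written more compactly with a transpose, guarded so that it only runs when agg() really returned a Series. A small sketch, where the Series below is a hypothetical stand-in for the aggregation result (the column names and values are assumptions):

```python
import pandas as pd

# Hypothetical stand-in for what df.loc[partition].agg(aggregations) may return:
# a Series indexed by column name, each element a one-item list.
grouped_columns = pd.Series({'age': [31.5], 'occupation': ['Sales,Tech-support']})

if isinstance(grouped_columns, pd.Series):
    # Series -> one-row DataFrame whose columns are the former index labels.
    grouped_columns = grouped_columns.to_frame().T

print(grouped_columns.shape)
print(list(grouped_columns.columns))
```

The isinstance guard keeps the code working on pandas versions where agg() already returns a DataFrame.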
I was able to make it work by changing anonypy.py at lines 79 and 108.
Replacing this line
values = grouped_columns.iloc[0].to_dict()
with this one
values = {}
for name, val in grouped_columns.items():
    values[name] = val[0]
Hope it helps.
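For anyone who needs the replacement loop to work whether agg() returns a Series of one-item lists or a one-row DataFrame, it could be wrapped in a small helper. The name first_values is hypothetical and not part of anonypy; the sample inputs are made up.

```python
import pandas as pd

def first_values(grouped_columns):
    """Return {column: first aggregated value} for a Series of lists or a one-row DataFrame."""
    values = {}
    for name, val in grouped_columns.items():
        # Series.items() yields (label, element); DataFrame.items() yields (column, Series).
        values[name] = val.iloc[0] if isinstance(val, pd.Series) else val[0]
    return values

# Hypothetical aggregation results in both shapes.
as_series = pd.Series({'age': [31.5], 'occupation': ['Sales']})
as_frame = pd.DataFrame({'age': [31.5], 'occupation': ['Sales']})

print(first_values(as_series))
print(first_values(as_frame))
```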