K-Anonymity
Anonymization methods for network security.
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
in
----> 1 dfn = build_anonymized_dataset(df, finished_partitions, feature_columns, sensitive_column)

in build_anonymized_dataset(df, partitions, feature_columns, sensitive_column, max_partitions)
     14 grouped_columns = df.loc[partition].agg(aggregations, squeeze=False)
     15 sensitive_counts =...
grouped_columns = df.loc[partition].agg(aggregations, squeeze=False)
I tried converting this Python code to PySpark and am running the same dataset with the PySpark code on an AWS EMR cluster. For 200 records it was taking 9 minutes...
Replace line 173 with: values = {'age': grouped_columns[0], 'education-num': grouped_columns[0]}. This removes the error "list object has no attribute 'to_dict()'". Then, while printing your...
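The replacement quoted above hard-codes two column names. Below is a minimal sketch of the same idea for an arbitrary list of feature columns. It is not the repository's code: the helper name `aggregate_partition`, the sample data, and the choice of aggregates (mean for numeric columns, a comma-joined set of values for categorical ones) are all assumptions made for illustration.

```python
import pandas as pd


def aggregate_partition(df, partition, feature_columns):
    """Build a plain dict with one generalized value per feature column.

    Hypothetical helper: numeric columns are summarized by their mean,
    other columns by a sorted, comma-joined set of their distinct values.
    """
    rows = df.loc[partition]
    values = {}
    for column in feature_columns:
        if pd.api.types.is_numeric_dtype(rows[column]):
            values[column] = rows[column].mean()
        else:
            values[column] = ",".join(sorted(set(rows[column].astype(str))))
    return values


# Made-up sample data, used only to exercise the helper.
df = pd.DataFrame(
    {
        "age": [25, 35, 45],
        "education-num": [9, 13, 10],
        "workclass": ["Private", "State-gov", "Private"],
    }
)

# Treat the first two rows as one partition.
values = aggregate_partition(df, df.index[:2], ["age", "education-num", "workclass"])
print(values)
```

Because the result is built directly as a dict, no `to_dict()` call is needed on the aggregated object, which sidesteps the case where the aggregation returns a list.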