base_generic_feature_statistics_generator.py throws AttributeError
I ran this code in a Jupyter notebook:
gfsg = GenericFeatureStatisticsGenerator()
proto = gfsg.ProtoFromDataFrames([{'name': 'data', 'table': df}])
protostr = base64.b64encode(proto.SerializeToString()).decode("utf-8")
HTML_TEMPLATE = """<link rel="import" href="/nbextensions/facets-dist/facets-jupyter.html" >
<facets-overview id="elem1"></facets-overview>
<script>
document.querySelector("#elem1").protoInput = "{protostr}";
</script>"""
html = HTML_TEMPLATE.format(protostr=protostr)
display(HTML(html))
Got this error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-24-e96a5ba70ee7> in <module>()
1 # Calculate the feature statistics proto from the datasets and stringify it for use in facets overview
2 gfsg = GenericFeatureStatisticsGenerator()
----> 3 proto = gfsg.ProtoFromDataFrames([{'name': 'data', 'table': df}])
4 protostr = base64.b64encode(proto.SerializeToString()).decode("utf-8")
5
/Users/Pat/Dev/facets/facets_overview/python/base_generic_feature_statistics_generator.py in ProtoFromDataFrames(self, dataframes, histogram_categorical_levels_count)
59 return self.GetDatasetsProto(
60 datasets,
---> 61 histogram_categorical_levels_count=histogram_categorical_levels_count)
62
63 def DtypeToType(self, dtype):
/Users/Pat/Dev/facets/facets_overview/python/base_generic_feature_statistics_generator.py in GetDatasetsProto(self, datasets, features, histogram_categorical_levels_count)
261 for item in value['vals']:
262 strs.append(item if hasattr(item, '__len__') else
--> 263 item.encode('utf-8'))
264
265 featstats.avg_length = np.mean(np.vectorize(len)(strs))
AttributeError: 'int' object has no attribute 'encode'
Following this advice on StackOverflow, I fixed it by changing line 263 of base_generic_feature_statistics_generator.py to:
for item in value['vals']:
  strs.append(item if hasattr(item, '__len__') else
              repr(item).encode('utf-8'))  # added repr(item)
I wrapped the item in repr() before calling encode(), and that seems to work now. I think y'all aren't accepting pull requests, so here is my bug report; I'm submitting it mainly for the folks who run into this via Google!
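For anyone who wants to see the failure in isolation, here is a minimal sketch that reproduces it outside of Facets. The two helper functions are hypothetical names made up for this illustration; they only mirror the conditional expression shown in the traceback and in the patched line, not the rest of GetDatasetsProto.

```python
def entry_original(item):
    # Mirrors the expression on line 263 from the traceback: anything with
    # __len__ (e.g. a str) passes through, everything else gets .encode().
    return item if hasattr(item, '__len__') else item.encode('utf-8')


def entry_patched(item):
    # Mirrors the workaround: repr() turns the int into a str first,
    # so .encode() exists.
    return item if hasattr(item, '__len__') else repr(item).encode('utf-8')


# A list mixing strings and ints, standing in for value['vals'].
vals = ['abc', 42]

original_error = None
try:
    [entry_original(v) for v in vals]
except AttributeError as err:
    # An int has neither __len__ nor encode, so the else branch raises.
    original_error = str(err)

patched = [entry_patched(v) for v in vals]
lengths = [len(p) for p in patched]
```

With the patch every entry has a length, so the later np.vectorize(len)(strs) call on line 265 has something to measure. Note that this only works around the symptom: the int still ends up being counted as a string value.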
To get it working I had to delete the /Users/Pat/Dev/facets/facets_overview/python/base_generic_feature_statistics_generator.pyc file and also restart the notebook kernel.
Perhaps there is a better way to handle this? Let me know, I'm rather ignorant of the bug filing process.
Thanks for the bug report!
Which dataframe were you generating feature statistics from? Is it private, or would you be willing to share it to help debug this? It seems that in your case the feature (column) on which it failed contained int values, but Facets' feature statistics generator treated the column as a non-numeric type (otherwise it wouldn't be executing line 263 of base_generic_feature_statistics_generator.py).
Facets decides whether a feature in a dataframe is numeric or non-numeric using this logic: https://github.com/PAIR-code/facets/blob/master/facets_overview/python/base_generic_feature_statistics_generator.py#L63
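One way this situation can arise, sketched below under assumptions: if the decision is made from the column's dtype (the method at that line is DtypeToType(self, dtype), per the traceback), then a column holding a mix of ints and strings has dtype object and is not recognized as numeric, even though some of its individual values are ints. The is_numeric_dtype helper is a hypothetical stand-in written for this illustration, not the Facets implementation, and plain NumPy arrays are used in place of dataframe columns.

```python
import numpy as np


def is_numeric_dtype(dtype):
    """Hypothetical helper: True for integer or floating-point dtypes."""
    return bool(np.issubdtype(dtype, np.integer) or
                np.issubdtype(dtype, np.floating))


# A column of only ints gets an integer dtype.
clean_ints = np.array([1, 2, 3])

# A column mixing ints and strings has to be stored with dtype object.
mixed = np.array([1, 'a', 3], dtype=object)

clean_is_numeric = is_numeric_dtype(clean_ints.dtype)
mixed_is_numeric = is_numeric_dtype(mixed.dtype)

# The object column still contains plain Python ints, which have no
# .encode() method if they are later handled as strings.
ints_in_mixed = [v for v in mixed if isinstance(v, int)]
```

Checking df.dtypes for columns reported as object, and then looking at the Python types of the values in those columns, would show whether this is what happened with your data.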