openai-cookbook
df.ada_similarity.apply(eval).apply(np.array) is returning an error
I'm getting an error when running the line df["ada_similarity"] = df.ada_similarity.apply(eval).apply(np.array) from the example notebook https://github.com/openai/openai-cookbook/blob/main/examples/Clustering.ipynb. The error I'm getting is:
eval() arg 1 must be a string, bytes or code object
Full error:
tmp/ipykernel_45192/3289201929.py in

/apps/python3/lib/python3.7/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwargs)
   4355 dtype: float64
   4356 """
-> 4357 return SeriesApply(self, func, convert_dtype, args, kwargs).apply()
   4358
   4359 def _reduce(

/apps/python3/lib/python3.7/site-packages/pandas/core/apply.py in apply(self)
   1041 return self.apply_str()
   1042
-> 1043 return self.apply_standard()
   1044
   1045 def agg(self):

/apps/python3/lib/python3.7/site-packages/pandas/core/apply.py in apply_standard(self)
   1099 values,
   1100 f,  # type: ignore[arg-type]
-> 1101 convert=self.convert_dtype,
   1102 )
   1103

/apps/python3/lib/python3.7/site-packages/pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()

TypeError: eval() arg 1 must be a string, bytes or code object
This can be fixed via df["ada_similarity"] = df.ada_similarity.apply(eval).apply(np.array).apply(lambda x: x.astype(float))
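For context on the TypeError: eval() only accepts a string, bytes, or code object, so the call fails whenever the column already holds parsed Python lists instead of their string representations. A minimal sketch that handles both cases (the toy values and the to_float_array helper are made up for illustration; only the ada_similarity column name comes from the thread):

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the cookbook's DataFrame: one cell is a string
# (as loaded from CSV), one is already a parsed list.
df = pd.DataFrame({"ada_similarity": ["[0.1, 0.2]", [0.3, 0.4]]})

def to_float_array(cell):
    # eval() raises "arg 1 must be a string, bytes or code object"
    # when handed a list, so only eval string cells.
    if isinstance(cell, str):
        cell = eval(cell)
    return np.array(cell, dtype=float)

df["ada_similarity"] = df.ada_similarity.apply(to_float_array)
```

For untrusted input, ast.literal_eval is a safer drop-in for eval here, since the cells are plain list literals.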
Also, kmeans = KMeans(n_clusters=n_clusters, init="k-means++", random_state=42, n_init='auto') doesn't work; it should be: kmeans = KMeans(n_clusters=n_clusters, init="k-means++", random_state=42, n_init=10)
@yzvickie Thanks for sharing! Although I think the first line has to be
df["ada_similarity"] = df.apply(lambda x: x.astype(float)).apply(np.array)
Otherwise you might get the same error.
Thanks! Will fix.
Looks like n_init='auto' requires scikit-learn 1.2.0+. I'll omit it so that it just uses the default for whichever version of scikit-learn folks are using.
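A sketch of the version-agnostic call (the toy data is made up for illustration; n_clusters=4 is an arbitrary choice, not from the notebook):

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.random.RandomState(42).rand(20, 2)  # toy 2-D points

n_clusters = 4
# Omitting n_init lets each scikit-learn version fall back to its own
# default (10 in older releases, 'auto' in newer ones), so this line
# runs regardless of which version is installed.
kmeans = KMeans(n_clusters=n_clusters, init="k-means++", random_state=42)
kmeans.fit(X)
labels = kmeans.labels_
```

Note that newer scikit-learn versions may emit a FutureWarning about the changing n_init default, but the code still runs.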
I think this is now fixed (#66). The code runs for me. Let me know if it's still throwing an error for you. If so, I'd guess it's from using a different version of one of these libraries. Happy to look into it further if it's still presenting difficulties.