User and product embeddings unclear

Open EDGDrummond opened this issue 2 years ago • 5 comments

In 'User_and_product_embeddings.ipynb' there is a requirement to load 'output/embedded_babbage_similarity_50k.csv'. A comment states that this file needs to be generated in advance, but it is not clear which file or notebook generates this data. A link or an explanation of where to find it would be helpful.

EDGDrummond avatar Jan 04 '23 12:01 EDGDrummond

Will fix. Thanks!

ted-at-openai avatar Jan 10 '23 00:01 ted-at-openai

Hello @ted-at-openai

I wasn't able to find the file either.

brunobelloni avatar Mar 16 '23 14:03 brunobelloni

Same here. It looks like it would be a great example to run, but the data is unavailable.

wilmerhenao avatar Apr 19 '23 19:04 wilmerhenao

@ted-at-openai any chance this has been fixed yet?

chrisbrody avatar May 30 '23 16:05 chrisbrody

Sorry, hasn't been at the top of my priority list, even though I'm aware it's a deficiency. If any of you want to fix, I'm happy to accept a PR.

ted-at-openai avatar Jun 21 '23 16:06 ted-at-openai

Sorry, hasn't been at the top of my priority list, even though I'm aware it's a deficiency. If any of you want to fix, I'm happy to accept a PR.

@ted-at-openai Dataset update #535

brunobelloni avatar Jun 22 '23 00:06 brunobelloni

Hey folks! In another issue, they said they won't be able to fix it.

But it's pretty simple: create embeddings of the data as described in the Obtain_dataset notebook, then convert those embeddings to NumPy arrays. After that, take the mean of the embeddings group-wise (per user and per product) along axis=0.

I have attached my Jupyter notebook, go through it: EmbeddingsTut.zip
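For anyone who just wants the gist of the averaging step, here is a minimal sketch. It is not taken from the attached notebook. The column names (`UserId`, `ProductId`, `embedding`) and the tiny 2-dimensional vectors are assumptions for illustration only; real embeddings would come from the embedding step in the Obtain_dataset notebook and be much longer.

```python
import ast

import numpy as np
import pandas as pd

# Toy stand-in for the embedded reviews CSV. The values are made up and the
# column names are assumptions, not the notebook's actual schema.
df = pd.DataFrame({
    "UserId": ["u1", "u1", "u2"],
    "ProductId": ["p1", "p2", "p1"],
    # Embeddings saved to CSV are read back as strings.
    "embedding": ["[1.0, 0.0]", "[0.0, 1.0]", "[0.5, 0.5]"],
})

# Parse each string into a Python list, then into a NumPy array.
df["embedding"] = df["embedding"].apply(ast.literal_eval).apply(np.array)


def mean_by_group(frame, key):
    """Average the embedding vectors of every row sharing the same key."""
    return {
        name: np.mean(np.stack(group.to_list()), axis=0)
        for name, group in frame.groupby(key)["embedding"]
    }


user_embeddings = mean_by_group(df, "UserId")
product_embeddings = mean_by_group(df, "ProductId")

print(user_embeddings["u1"])
print(product_embeddings["p1"])
```

Stacking the per-row vectors into a 2-D array and averaging along axis=0 keeps the embedding dimension intact, so each user and each product ends up with one vector of the same length as the original embeddings.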

FarziBuilder avatar Aug 16 '23 11:08 FarziBuilder