Q/A Embeddings Function constantly fails

Open capri-ai-us opened this issue 2 years ago • 4 comments

I am fairly new to Python, but I am trying to follow along with the Colab notebook exactly, and no matter what I try, I keep getting the following error:

----> 1 def load_embeddings(fname: str) -> dict[tuple[str, str], list[float]]:
      2     """
      3     Read the document embeddings and their keys from a CSV.
      4
      5     fname is the path to a CSV with exactly these named columns:

TypeError: 'type' object is not subscriptable

For reference, this is from cells 7 and 8 of this notebook:

https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb

Any help is greatly appreciated. I've already computed the embeddings at this point and just want to calculate the nearest K from the results, but the function definition itself fails before anything can run.

capri-ai-us • Feb 01 '23 18:02

Does anyone have insight on this? I have the same problem.

rnehrboss • Feb 03 '23 15:02

I put the function and your question in ChatGPT, and this is the response:

The error message indicates that the square-bracket syntax [] used on the dict type is not supported by the Python version running the code. This syntax specifies the types of the dictionary's keys and values. It is not required for the function to work, so the simplest fix is to annotate the return type as plain dict, like this:

import pandas as pd

def load_embeddings(fname: str) -> dict:
    """
    Read the document embeddings and their keys from a CSV.

    fname is the path to a CSV with exactly these named columns:
        "title", "heading", "0", "1", ... up to the length of the embedding vectors.
    """
    df = pd.read_csv(fname, header=0)
    # Every column other than "title" and "heading" is an embedding dimension index.
    max_dim = max(int(c) for c in df.columns if c not in ("title", "heading"))
    return {
        (r.title, r.heading): [r[str(i)] for i in range(max_dim + 1)]
        for _, r in df.iterrows()
    }

So essentially ChatGPT said to remove the generics for the dict return type. Did that fix it for you? 😅

Or, if that doesn't help, then I think you need to double check that your Python version in Colab is 3.9+ since I've seen some typing issues with older Python versions. Source: https://stackoverflow.com/questions/71041586/typeerror-type-object-is-not-subscriptable#comment125585318_71041612
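
A quick way to check this in a Colab cell might look like the sketch below. The helper name supports_builtin_generics is made up for illustration; the only real API used is sys.version_info.

```python
import sys


def supports_builtin_generics(version_info=sys.version_info) -> bool:
    """Return True if hints like dict[str, int] can be evaluated (Python 3.9+)."""
    # Compare only (major, minor); built-in generics arrived in 3.9.
    return tuple(version_info[:2]) >= (3, 9)


print(sys.version)
print("dict[...] style hints supported:", supports_builtin_generics())
```

If the second line prints False, the hints in the notebook need to be removed or rewritten for that runtime.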

jchiare • Feb 03 '23 17:02

Sorry for the trouble caused by type hinting. Most of the time this is caused by running a version of Python 3.7 or older. Deleting the type hints should eliminate the errors without compromising the code's functionality. Let me know if any assistance is still needed.
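
As a minimal illustration of why removing the hints is safe: Python does not enforce annotations at runtime, so the two functions below behave identically. Both function names are invented for this example and are not part of the notebook.

```python
from typing import List


def mean_with_hints(values: List[float]) -> float:
    # Annotated version; the hints are stored as metadata only.
    return sum(values) / len(values)


def mean_without_hints(values):
    # Same body with the hints deleted.
    return sum(values) / len(values)


data = [1.0, 2.0, 3.0]
print(mean_with_hints(data) == mean_without_hints(data))
print(mean_without_hints.__annotations__)
```

The only thing lost by deleting the hints is the documentation and static-checking value they provide.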

ted-at-openai • Feb 06 '23 18:02

I use Python 3.8. On that version, the return type hint needs to change like this:

from typing import Tuple, List

def strings_ranked_by_relatedness(
# ...
) -> Tuple[List[str], List[float]]:
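
The same change should apply to the load_embeddings signature from the original question. Below is a rough, self-contained sketch of that idea. To keep it runnable without extra packages, it assumes the standard-library csv module in place of pandas and reads from an open file object; the name load_embeddings_from_file is hypothetical and not from the notebook.

```python
import csv
import io
from typing import Dict, List, TextIO, Tuple


def load_embeddings_from_file(f: TextIO) -> Dict[Tuple[str, str], List[float]]:
    """Read embeddings keyed by (title, heading) from an open CSV file."""
    reader = csv.DictReader(f)
    # Every column other than "title" and "heading" is a dimension index.
    dim_columns = sorted(
        (c for c in (reader.fieldnames or []) if c not in ("title", "heading")),
        key=int,
    )
    return {
        (row["title"], row["heading"]): [float(row[c]) for c in dim_columns]
        for row in reader
    }


# Tiny in-memory CSV standing in for the real embeddings file.
sample = io.StringIO("title,heading,0,1\nDoc,Intro,0.1,0.2\n")
embeddings = load_embeddings_from_file(sample)
print(embeddings)
```

The typing.Dict, typing.Tuple, and typing.List aliases work on Python 3.8 and earlier, as well as on newer versions.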

aquastartw • Apr 24 '23 12:04

Sorry, I should have said Python 3.8 or older. Support for subscripting built-in types in type hints (for example list[str] or dict[str, int]) was added in 3.9. Hope you figured it out.

ted-at-openai • Jun 21 '23 17:06