KeyLLM seems to use OpenAI parameters that are deprecated

Open lfoppiano opened this issue 2 years ago • 14 comments

First of all, this tool is amazing :-)

I'm trying to use KeyLLM with the OpenAI API, but when I import the OpenAI module from keybert, I can't help noticing that the default parameters look quite old, something like "gpt-3.5-instruct".

The code is something like this:

from keybert import KeyLLM
from keybert.llm import OpenAI

lc_chatgpt = OpenAI(model="gpt-3.5-turbo")
kw_model = KeyLLM(llm=lc_chatgpt)

[...]

keywords_abstracts = kw_model.extract_keywords(abstracts, embeddings=embeddings_abstracts, threshold=0.9)

When I follow your instructions, I get a deprecation error:

openai.lib._old_api.APIRemovedInV1: 

You tried to access openai.Completion, but this is no longer supported in openai>=1.0.0 - see the README at https://github.com/openai/openai-python for the API.

You can run `openai migrate` to automatically upgrade your codebase to use the 1.0.0 interface. 

Alternatively, you can pin your installation to the old version, e.g. `pip install openai==0.28`

A detailed migration guide is available here: https://github.com/openai/openai-python/discussions/742

Here are the library versions:

openai                             1.3.3
keybert                            0.8.3
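With openai 1.3.3 installed, the environment falls in the `openai>=1.0.0` range named in the error message. A minimal sketch for checking this in any environment, using only the standard library; the helper names `is_openai_v1` and `installed_version` are made up for illustration and are not part of keybert or openai:

```python
from importlib import metadata
from typing import Optional


def is_openai_v1(version: str) -> bool:
    """Return True when a version string such as '1.3.3' has a major version >= 1."""
    major = version.split(".")[0]
    return int(major) >= 1


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("openai")
    if found is None:
        print("openai is not installed")
    elif is_openai_v1(found):
        print(f"openai {found}: openai.Completion and openai.ChatCompletion are removed")
    else:
        print(f"openai {found}: the legacy interface is still available")
```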

Thank you in advance

lfoppiano avatar Nov 20 '23 23:11 lfoppiano

Ah, that is correct! It seems that openai has updated their package with some breaking changes. Perhaps if you set openai to 0.28, it might just work. I'll make sure to update the backend so that it works with their newest release. That likely will introduce a breaking change since I want to only support openai>1.

MaartenGr avatar Nov 29 '23 08:11 MaartenGr

@lfoppiano I just pushed a fix in #189. If you have the time, could you check whether it works for you?

MaartenGr avatar Nov 29 '23 08:11 MaartenGr

Hi,

I can't seem to get it to work.

I've installed keybert and openai as follows:

pip install keybert
pip install openai

The versions are:

keybert                   0.8.3
openai                    1.3.7

I've subsequently run the following:

import openai
from keybert.llm import OpenAI
from keybert import KeyLLM

client = openai.OpenAI(api_key=OpenAI.api_key)
llm = OpenAI(client)
kw_model = KeyLLM(llm)

[...]

keywords = kw_model.extract_keywords(docs, check_vocab=True)

However, I end up with the following error:

---------------------------------------------------------------------------
APIRemovedInV1                            Traceback (most recent call last)
Cell In[105], line 2
      1 # Extract keywords
----> 2 keywords = kw_model.extract_keywords(docs, check_vocab=True)

File ~\.conda\envs\PhDProjectsWork\Lib\site-packages\keybert\_llm.py:126, in KeyLLM.extract_keywords(self, docs, check_vocab, candidate_keywords, threshold, embeddings)
    123         keywords = [in_cluster_keywords[index] for index in range(len(docs))]
    124 else:
    125     # Extract keywords using a Large Language Model (LLM)
--> 126     keywords = self.llm.extract_keywords(docs, candidate_keywords)
    128 # Only extract keywords that appear in the input document
    129 if check_vocab:

File ~\.conda\envs\PhDProjectsWork\Lib\site-packages\keybert\llm\_openai.py:177, in OpenAI.extract_keywords(self, documents, candidate_keywords)
    175         response = chat_completions_with_backoff(**kwargs)
    176     else:
--> 177         response = openai.ChatCompletion.create(**kwargs)
    178     keywords = response["choices"][0]["message"]["content"].strip()
    180 # Use a non-chat model
    181 else:

File ~\.conda\envs\PhDProjectsWork\Lib\site-packages\openai\lib\_old_api.py:39, in APIRemovedInV1Proxy.__call__(self, *_args, **_kwargs)
     38 def __call__(self, *_args: Any, **_kwargs: Any) -> Any:
---> 39     raise APIRemovedInV1(symbol=self._symbol)

APIRemovedInV1: 

You tried to access openai.ChatCompletion, but this is no longer supported in openai>=1.0.0 - see the README at https://github.com/openai/openai-python for the API.

You can run `openai migrate` to automatically upgrade your codebase to use the 1.0.0 interface. 

Alternatively, you can pin your installation to the old version, e.g. `pip install openai==0.28`

A detailed migration guide is available here: https://github.com/openai/openai-python/discussions/742

The documentation shown by help(OpenAI) reads:

|  Using the OpenAI API to extract keywords
 |  
 |      The default method is `openai.Completion` if `chat=False`.
 |      The prompts will also need to follow a completion task. If you
 |      are looking for a more interactive chats, use `chat=True`
 |      with `model=gpt-3.5-turbo`.

This would suggest that the error received is correct as openai.Completion is deprecated.

I thought the applied fix works for openai > 1.0? Could you help clarify what I'm doing incorrectly?

On the other hand, if I try the following:

client = openai.OpenAI(api_key=OpenAI.api_key)
llm = OpenAI(client, chat=True, model="gpt-3.5-turbo")
kw_model = KeyLLM(llm)

I end up with the following error instead:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[108], line 3
      1 # Create LLM
      2 client = openai.OpenAI(api_key=OpenAI.api_key)
----> 3 llm = OpenAI(client, chat=True, model="gpt-3.5-turbo")
      5 # Load it in KeyLLM
      6 kw_model = KeyLLM(llm)

TypeError: OpenAI.__init__() got multiple values for argument 'model'
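One likely reading of this TypeError, assuming the installed class takes `model` as its first positional parameter: the positional `client` is bound to `model`, and the explicit `model=` keyword then collides with it. The class below is a hypothetical stand-in that only reproduces the Python mechanism; it is not keybert's actual constructor:

```python
class LegacyStyleLLM:
    """Hypothetical stand-in whose first positional parameter is `model`."""

    def __init__(self, model="some-default-model", chat=False):
        self.model = model
        self.chat = chat


client = object()  # placeholder for an openai.OpenAI(...) client

try:
    # `client` fills the first positional slot (`model`),
    # so the `model=` keyword supplies a second value for the same parameter.
    LegacyStyleLLM(client, chat=True, model="gpt-3.5-turbo")
except TypeError as exc:
    message = str(exc)
    print(message)
```

If this is what happens, the error points to a constructor that does not yet accept a client as its first argument.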

What am I doing wrong?

adegboyegaFAU avatar Dec 08 '23 11:12 adegboyegaFAU

@adegboyegaFAU You are not using the fix. To install the fix, you should run the following instead:

pip install -U git+https://github.com/MaartenGr/KeyBERT@openai_fix

MaartenGr avatar Dec 08 '23 11:12 MaartenGr

Works now! Thanks @MaartenGr. I had actually tried that earlier, following @lfoppiano's post on fix #189, and it didn't work. It turns out that I had not restarted Anaconda after uninstalling keybert.
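A running kernel can keep an already imported module in memory after a reinstall, so it can help to check which constructor is actually loaded. A minimal sketch using `inspect.signature`; `accepts_parameter` and the two classes are hypothetical stand-ins for illustration, not keybert code:

```python
import inspect


def accepts_parameter(func, name: str) -> bool:
    """Return True if the callable `func` has a parameter called `name`."""
    return name in inspect.signature(func).parameters


class WithoutClient:  # hypothetical stand-in for a class without a `client` parameter
    def __init__(self, model="some-default-model", chat=False):
        self.model = model
        self.chat = chat


class WithClient:  # hypothetical stand-in for a class that takes a `client`
    def __init__(self, client, model="some-default-model", chat=False):
        self.client = client
        self.model = model
        self.chat = chat


print(accepts_parameter(WithoutClient, "client"))
print(accepts_parameter(WithClient, "client"))
```

In a freshly restarted session, the same check could be applied to the real class with `accepts_parameter(OpenAI, "client")` after `from keybert.llm import OpenAI`.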

I really do love the tool by the way. Great work

adegboyegaFAU avatar Dec 08 '23 12:12 adegboyegaFAU

@MaartenGr any estimate on when this fix will be released?

lfoppiano avatar Dec 13 '23 01:12 lfoppiano

@lfoppiano I just pushed the fix to the main branch, an official release will follow either this or next week.

MaartenGr avatar Dec 13 '23 06:12 MaartenGr

Great, thanks! I've been testing it extensively these days and it works fine.

lfoppiano avatar Dec 13 '23 08:12 lfoppiano

@MaartenGr the fix is not yet released, right?

fabmeyer avatar Feb 15 '24 09:02 fabmeyer

@lfoppiano Can you provide a minimal working example? I am running into problems when using an OpenAI LLM for keyword generation.

import os
import openai
from keybert import KeyLLM
from keybert.llm import OpenAI

openai.api_key = os.getenv('OPENAI_API_KEY')
llm = OpenAI(
    client = openai,
    model = "gpt-3.5-turbo-instruct",
    prompt = "Summarize the following text of keywords with a maximum of 5 keywords: \n\n-",
    chat = False,
    verbose = False,
    )

kw_model_2 = KeyLLM(llm)

year = 2010
texts_to_process = unique_keywords_2[year]
topics = kw_model_2.extract_keywords(texts_to_process)

KeyError                                  Traceback (most recent call last)
File ~/.local/share/virtualenvs/Moritz_project-3DWIh2uO/lib/python3.9/site-packages/pydantic/main.py:759, in BaseModel.__getattr__(self, item)
    758 try:
--> 759     return pydantic_extra[item]
    760 except KeyError as exc:

KeyError: 'message'

The above exception was the direct cause of the following exception:

AttributeError                            Traceback (most recent call last)
Cell In[10], line 17
     15 year = 2010
     16 texts_to_process = unique_keywords_2[year]
---> 17 topics = kw_model_2.extract_keywords(texts_to_process)
     19 print(topics)

File ~/.local/share/virtualenvs/Moritz_project-3DWIh2uO/lib/python3.9/site-packages/keybert/_llm.py:126, in KeyLLM.extract_keywords(self, docs, check_vocab, candidate_keywords, threshold, embeddings)
    123         keywords = [in_cluster_keywords[index] for index in range(len(docs))]
    124 else:
    125     # Extract keywords using a Large Language Model (LLM)
--> 126     keywords = self.llm.extract_keywords(docs, candidate_keywords)
    128 # Only extract keywords that appear in the input document
    129 if check_vocab:

File ~/.local/share/virtualenvs/Moritz_project-3DWIh2uO/lib/python3.9/site-packages/keybert/llm/_openai.py:189, in OpenAI.extract_keywords(self, documents, candidate_keywords)
    187     else:
    188         response = self.client.completions.create(model=self.model, prompt=prompt, **self.generator_kwargs)
--> 189     keywords = response.choices[0].message.content.strip()
    190 keywords = [keyword.strip() for keyword in keywords.split(",")]
    191 all_keywords.append(keywords)

File ~/.local/share/virtualenvs/Moritz_project-3DWIh2uO/lib/python3.9/site-packages/pydantic/main.py:761, in BaseModel.__getattr__(self, item)
    759         return pydantic_extra[item]
    760     except KeyError as exc:
--> 761         raise AttributeError(f'{type(self).__name__!r} object has no attribute {item!r}') from exc
    762 else:
    763     if hasattr(self.__class__, item):

AttributeError: 'CompletionChoice' object has no attribute 'message'
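
The traceback shows `.message` being read from the result of `completions.create`. In the openai>=1.0 interface, a plain completion choice carries its text in `.text`, while a chat completion choice carries it in `.message.content`. A minimal sketch of reading both shapes; `first_choice_text` and the `SimpleNamespace` objects are hypothetical stand-ins that mimic only the attributes involved, and this is not keybert's actual code:

```python
from types import SimpleNamespace


def first_choice_text(response, chat: bool) -> str:
    """Read the generated text from the first choice of a response object."""
    choice = response.choices[0]
    if chat:
        # Chat completions carry the text in choice.message.content
        return choice.message.content.strip()
    # Plain completions carry the text in choice.text
    return choice.text.strip()


# Hypothetical stand-ins that mimic only the attributes used above.
chat_response = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content=" apples, pears "))]
)
completion_response = SimpleNamespace(
    choices=[SimpleNamespace(text=" apples, pears ")]
)

print(first_choice_text(chat_response, chat=True))
print(first_choice_text(completion_response, chat=False))
```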

fabmeyer avatar Feb 15 '24 09:02 fabmeyer

@fabmeyer I use the gpt-3.5-turbo OpenAI model with chat=True.

I assembled an example from the code I've used (disclaimer: I did not test it):

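import openai
from keybert import KeyLLM
from keybert.llm import OpenAI
from sentence_transformers import SentenceTransformer

# openai.OpenAI() reads the API key from the OPENAI_API_KEY environment variable
client = openai.OpenAI()
chatgpt = OpenAI(client, model="gpt-3.5-turbo", chat=True)
kw_model = KeyLLM(llm=chatgpt)
model = SentenceTransformer('all-MiniLM-L6-v2')

# `works` is not shown here; missing or None abstracts become empty strings
abstracts = [work['abstract'] if 'abstract' in work and work['abstract'] is not None else "" for work in works]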
embeddings_abstracts = model.encode(abstracts, convert_to_tensor=True)
keywords_abstracts = kw_model.extract_keywords(abstracts, embeddings=embeddings_abstracts, threshold=0.5)

lfoppiano avatar Feb 15 '24 10:02 lfoppiano

Ah right, I should definitely release an official version. Let me work on it for a bit and I'll let you know when I release 0.8.4.

MaartenGr avatar Feb 15 '24 10:02 MaartenGr

Apologies for the delay (and thanks for the ping)! I just pushed 0.8.4 to PyPI, so all changes to the main branch should now be in the official release.

MaartenGr avatar Feb 15 '24 11:02 MaartenGr

0.8.4 does not work for me. What is described in https://github.com/MaartenGr/KeyBERT/issues/187#issuecomment-1945760944 works for me.

adlternative avatar Jul 14 '24 09:07 adlternative