BERTopic OpenAI representation fails to produce output when response content is None

Open jeaninejuliettes opened this issue 4 months ago • 6 comments

Have you searched existing issues? 🔎

[X] I have searched and found no existing issues

Describe the bug

I ran into a problem with the OpenAI representation: the response sometimes has a content of None, which then raises an error on this line: label = response.choices[0].message.content.strip().replace("topic: ", "")

Which makes sense, since the content is not a string. I'm unable to generate a minimal example since this is due to the output of OpenAI GPT.
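
Although the real trigger depends on what the model returns, the failure itself can be imitated without calling the API. The sketch below is only an illustration: it assumes the response exposes choices[0].message.content as in the line quoted above, and it uses a SimpleNamespace stand-in rather than a real OpenAI response object.

```python
from types import SimpleNamespace

# Stand-in for a chat completion whose message content is None.
# This is a hypothetical test object, not a real OpenAI response.
response = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content=None))]
)

error_message = None
try:
    # Same expression as the failing line: None has no .strip() method
    label = response.choices[0].message.content.strip().replace("topic: ", "")
except AttributeError as err:
    error_message = str(err)
    print(f"AttributeError: {error_message}")
```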

I see two ways to work around this, but each has downsides that affect the results. Maybe someone else sees a better option:

1. Cast the content to a string before processing it any further. The major downside is that the label will then be set to the string 'None'.
2. Use a try/except to extract the content, strip it, and replace the 'topic:' part of the string. If this fails, the label is set to a fixed value such as an empty string, and a warning is raised to report that this has happened.

For now I have worked around it by defining a customOpenAI representation class in my script that inherits from the OpenAI representation and applies the second option.

Reproduction

from bertopic import BERTopic

BERTopic Version

0.16.4

Oct 11 '24 06:10 jeaninejuliettes
