FileSearchToolCall.file_search has empty results
Confirm this is an issue with the Python library and not an underlying OpenAI API
- [ ] This is an issue with the Python library
Describe the bug
Continuing from issue #1938.
The error seems to be fixed with openai==1.58.1 (it no longer returns a 400 error). However, if we capture the output of the stream with a custom class inheriting from AssistantEventHandler, the results of the file search tool are not available:
```python
@override
def on_tool_call_done(self, tool_call: ToolCall):
    print(tool_call)
```
The printed tool call is:
```
FileSearchToolCall(id='call_ID', file_search=FileSearch(ranking_options=FileSearchRankingOptions(ranker='default_2024_08_21', score_threshold=0.0), results=[]), type='file_search', index=0)
```
whereas, according to openai/types/beta/threads/runs/file_search_tool_call.py, it is supposed to contain:
```python
class FileSearch(BaseModel):
    ranking_options: Optional[FileSearchRankingOptions] = None
    """The ranking options for the file search."""

    results: Optional[List[FileSearchResult]] = None
    """The results of the file search."""
```
This happens when creating the run as follows:
```python
with client.beta.threads.runs.stream(
    thread_id=thread.id,
    assistant_id=ass_id,
    event_handler=CustomEventHandler(),
    include=["step_details.tool_calls[*].file_search.results[*].content"],
) as stream:
    # Wait for the stream to complete
    stream.until_done()
```
To Reproduce
- Run the snippet below with the ID of an assistant that is connected to a vector store and has file search enabled (for simplicity, set this up through platform.openai.com):

```python
from typing_extensions import override  # typing.override requires Python 3.12+

from openai import AssistantEventHandler, OpenAI
from openai.types.beta.threads.runs.tool_call import ToolCall

client = OpenAI()

messages = [
    {
        "content": <QUESTION_TO_THE_ASSISTANT>,
    }
]

# Create a new thread for the assistant
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content=messages[-1]["content"],
)


class CustomEventHandler(AssistantEventHandler):
    @override
    def on_tool_call_done(self, tool_call: ToolCall):
        print(tool_call)


# Stream the assistant's response
with client.beta.threads.runs.stream(
    thread_id=thread.id,
    assistant_id=<ASSISTANT_ID>,
    event_handler=CustomEventHandler(),
    include=["step_details.tool_calls[*].file_search.results[*].content"],
) as stream:
    # Wait for the stream to complete
    stream.until_done()
```
Code snippets
No response
OS
Windows
Python version
Python 3.11.10
Library version
openai 1.58.1
Alright, let's dig into this. So, the 400 error from issue #1938 is gone in openai==1.58.1, but now the FileSearch results are empty when using a custom AssistantEventHandler. That's a sneaky bug.
Here's the breakdown and how we can tackle this:
Understanding the Issue
The on_tool_call_done method in your CustomEventHandler is supposed to receive the results of the file search tool call. However, the results list in the FileSearch object is empty, even though you've explicitly included step_details.tool_calls[*].file_search.results[*].content in the include parameter of the stream method. This suggests that either the results are not being populated correctly or there's an issue with how the include parameter is handled by the stream method.
Possible Causes
- Bug in openai==1.58.1: There might be a bug in the library that prevents the FileSearch results from being populated when using a custom event handler.
- Incorrect usage of the include parameter: The include parameter might not be working as expected, or there might be a different way to include the FileSearch results when using a custom event handler.
- Issue with the Assistant or Vector Store: There might be a configuration issue with the Assistant or the connected vector store that prevents the file search from returning results.
Debugging Steps
1. Verify the Assistant and Vector Store: Double-check that the Assistant is correctly configured to use the file search tool and that the vector store is properly connected and populated with data.
2. Test with the Default Event Handler: Try running the code with the default AssistantEventHandler (or without specifying an event handler) to see if the FileSearch results are populated correctly in that case. This will help isolate whether the issue is specifically with the custom event handler.
3. Inspect the Raw Response: If possible, capture the raw events coming back from the stream and examine their contents (see the sketch after this list). This might reveal clues about why the FileSearch results are missing, or whether there are any error messages in the response.
4. Simplify the Code: Try removing the include parameter or simplifying the custom event handler to see if that affects the results. This can help pinpoint whether the issue is related to the include parameter or to the custom event handler's logic.
5. Check for Updates: Ensure you're using the latest version of the openai library. If a newer version is available, try upgrading to see if it resolves the issue.
6. Report to OpenAI: If you're unable to identify the cause of the issue, report it to OpenAI with a detailed description, code snippet, and steps to reproduce. They might be able to provide insights or identify a bug in the library.
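For step 3, one way to capture the raw payloads is to override on_event, which receives every server-sent event before the typed callbacks fire. A minimal, hypothetical sketch (the RawEventLogger name is just illustrative and not part of the reporter's code):

```python
from typing_extensions import override

from openai import AssistantEventHandler
from openai.types.beta import AssistantStreamEvent


class RawEventLogger(AssistantEventHandler):
    """Dump every streamed event so we can see whether the file_search
    results ever arrive on the wire."""

    @override
    def on_event(self, event: AssistantStreamEvent) -> None:
        # `event.event` is the event name (e.g. "thread.run.step.delta")
        # and `event.data` is the parsed payload for that event.
        print(event.event)
        print(event.data.model_dump_json(indent=2))
```

Passing event_handler=RawEventLogger() to client.beta.threads.runs.stream(...) as in the snippet above should show whether the results ever appear in any thread.run.step.* event; if they never do, the gap is likely on the API side rather than in the SDK's parsing.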
By systematically investigating these points, we should be able to pinpoint the cause of the missing FileSearch results and get this functionality working as expected.
First, they're using an older version of the openai library (1.58.1). It might be worth bumping that up to the latest to see if it makes a difference. Sometimes those sneaky bugs get squashed in newer releases.
Second, that include parameter... it's a bit verbose. Maybe there's a simpler way to specify those FileSearch results? Worth checking the docs to see if there's a more concise syntax.
And lastly, this whole AssistantEventHandler thing... it's a bit of a black box. We don't know exactly how it interacts with the stream or processes the results. It might be worth digging into the source code to see if there are any clues there.
Overall, feels like a classic case of "it's not you, it's me" (or rather, it's the library). But with a bit of digging and some creative debugging, we should be able to crack this nut.
It seems the decision was made not to retrieve (or show) the results, as we can see from the docstrings on this class:
```python
class FileSearchToolCall(BaseModel):
    id: str
    """The ID of the tool call object."""

    file_search: FileSearch
    """For now, this is always going to be an empty object."""

    type: Literal["file_search"]
    """The type of tool call.

    This is always going to be `file_search` for this type of tool call.
    """
```
I'm getting this error today on the latest API version. I'm guessing the Assistants API has been abandoned in favor of the Responses API, which does work:
```python
response = await llm.responses.create(
    model="gpt-4o-mini",
    input="What is prompt engineering?",
    tools=[{
        "type": "file_search",
        "vector_store_ids": ["vs_J..."],
        "max_num_results": 2,
    }],
    include=["file_search_call.results"],
)
print(response.model_dump_json(indent=2))
```
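For what it's worth, with include=["file_search_call.results"] the results should be attached to the file_search_call item in response.output. A small follow-up sketch, assuming llm is an AsyncOpenAI client (implied by the await above) and going by my reading of the Responses types, so field names may differ slightly:

```python
# Pull the file search results out of the response output items.
for item in response.output:
    if item.type == "file_search_call":
        for result in item.results or []:
            # Each result should carry the source file, a relevance score,
            # and (because of the include flag) the matched text.
            print(result.filename, result.score)
            print(result.text)
```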
Thanks for reporting!
This sounds like an issue with the underlying OpenAI API and not the SDK, so I'm going to go ahead and close this issue.
Would you mind reposting at community.openai.com?