I can no longer upload files to vector store with AzureOpenAI
Confirm this is an issue with the Python library and not an underlying OpenAI API
- [X] This is an issue with the Python library
Describe the bug
Hi,
For the past 2 days I have been getting an error when I try to upload files to vector stores using the AzureOpenAI client. The same code works with OpenAI.
I changed nothing in my code, but since 31/07/2024 it no longer works with AzureOpenAI.
The output of file_batch:
File batch: FileCounts(cancelled=0, completed=0, failed=1, in_progress=0, total=1)
File batch status: failed
File status: failed
File last error: LastError(code='server_error', message='An internal error occurred.')
Is there a problem with AzureOpenAI?
Thanks, Matteo
To Reproduce
Use a simple file.txt or other types.
Execute the code and see the result.
Code snippets
import os

from openai import AzureOpenAI

client = AzureOpenAI(
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    api_version="2024-05-01-preview",
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
)

file_stream = open("path/of/my/simple/file.txt", "rb")

vector_store = client.beta.vector_stores.create(name="vs_test_assistant_v2")
vector_store_id = vector_store.id

print("Uploading file to vector store..")
file_batch = client.beta.vector_stores.file_batches.upload_and_poll(
    vector_store_id=vector_store_id,
    files=[file_stream],
)
print(f"File batch status: {file_batch.status}")
print(f"File batch: {file_batch.file_counts}")

# Inspect the first file in the vector store to see why the batch failed.
file = client.beta.vector_stores.files.list(vector_store_id).data[0]
print(f"File status: {file.status}")
if file.status == "failed":
    print(f"File last error: {file.last_error}")
OS
Linux
Python version
Python v3.10.12
Library version
openai v1.37.2
cc @kristapratico
@matteopulega I'm not able to reproduce the error. Can you share which region your Azure OpenAI resource is in?
If this is still failing today, I recommend opening a support ticket against the service.
The region is swedencentral.
same problem here
OK, after some tests: I can upload files when a vector store is empty, but not when it already contains at least one document. In that case the file status stays at 'in_progress'.
The vector stores UI at https://oai.azure.com/ also has problems: sometimes I can't add or delete files.
Now, however, I have a different problem:
After creating a vector store and, for example, 2 files, when I execute client.beta.vector_stores.file_batches.create_and_poll(vector_store_id=vector_store_id, file_ids=file_ids) the call never returns.
are you on the latest version?
Yes, on 1.48. As a reminder, I'm using AzureOpenAI in the swedencentral region.
Now it works. It seems that vector stores on AzureOpenAI sometimes stop working correctly.
Same here. The file upload was working as of Oct 9th, the last time I checked, and now when I execute the code below, the process never ends.
openai version = 1.51.2 (latest), previously 1.50.2
file_paths = ["./testFile.pdf"]
file_streams = [open(path, "rb") for path in file_paths]

# upload and poll the file
file_batch = client.beta.vector_stores.file_batches.upload_and_poll(
    vector_store_id=vector_store.id, files=file_streams
)
print(file_batch.status)
The same example that previously worked has now stopped working. No changes were made to the code.
Same here again
UPDATE: We had created a lot of vector stores. I removed many of them, and now it's running smoothly again.
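For anyone who wants to try the same cleanup, here is a minimal sketch. It is not what the commenter above ran: the function name `delete_stale_vector_stores`, the age-based criterion, and the `dry_run` flag are all assumptions made for illustration. The `vector_stores` argument is expected to be `client.beta.vector_stores` (or `client.vector_stores` on newer SDK versions), of which only `list()` and `delete()` are used.

```python
import time
from typing import Any, List, Optional


def delete_stale_vector_stores(
    vector_stores: Any,
    max_age_days: float,
    now: Optional[float] = None,
    dry_run: bool = True,
) -> List[str]:
    """Return the ids of vector stores older than max_age_days.

    When dry_run is False the stores are also deleted. The age criterion
    is an assumption for this sketch; pick whatever rule fits your setup.
    """
    current = time.time() if now is None else now
    cutoff = current - max_age_days * 86400

    # Collect ids first so that deleting does not disturb list pagination.
    stale_ids = [vs.id for vs in vector_stores.list() if vs.created_at < cutoff]

    if not dry_run:
        for vs_id in stale_ids:
            vector_stores.delete(vs_id)
    return stale_ids


# Hypothetical usage with a real client:
# print(delete_stale_vector_stores(client.beta.vector_stores, max_age_days=30))
```

The default is a dry run, so the first call only reports which stores would be removed.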
Yes, same here. Sometimes it works right away, but then all of a sudden the CreateBatchFileJob call stays in_progress:
private bool AddFilesToVectorStore(VectorStore vectorStore, List<AIFile> filesToAdd)
{
    // Now add the files to the vector store
    var vectorJob = _assistantVectorClient.CreateBatchFileJob(vectorStore.Id, filesToAdd.Select(f => f.FileId).ToList(), false);

    // If the run is not successful, we will log it now
    if (vectorJob.Status == VectorStoreBatchFileJobStatus.InProgress)
    {
        _logger.LogWarning("Vector job is still in progress, waiting for completion.");
        WaitForVectorStoreJobCompletion(_assistantVectorClient, vectorJob.Value);
    }
    else if (vectorJob.Status != VectorStoreBatchFileJobStatus.Completed)
    {
        _logger.LogWarning($"Run failed with status: {vectorJob.Status}");
        // We will cancel this job and retry it
        _assistantVectorClient.CancelBatchFileJob(vectorStore.Id, vectorJob.Value.BatchId);
        _logger.LogWarning("Vector job cancelled");
        return false;
    }
    else
    {
        _logger.LogInformation("Vector job completed.");
    }

    var vectorJobStatus = _assistantVectorClient.GetBatchFileJob(vectorStore.Id, vectorJob.Value.BatchId);
    _logger.LogInformation($"Files completed: {vectorJobStatus.Value?.FileCounts.Completed}");
    return true;
}
UPDATE: The upload is working smoothly again. I did not change anything anywhere. The issue seems to be intermittent and can persist for a long time before things go back to normal.
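Since several reports above describe `upload_and_poll` / `create_and_poll` never returning while the batch stays `in_progress`, one way to avoid a hung process is to create the batch without polling and then poll with a deadline. The sketch below is only an illustration: `poll_until_done` and its parameters are hypothetical names, not part of the openai library, and it does nothing to fix the service-side problem.

```python
import time
from typing import Callable

# Terminal statuses of a vector store file batch.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def poll_until_done(
    fetch_status: Callable[[], str],
    timeout_s: float = 300.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call fetch_status() until it returns a terminal status.

    Raises TimeoutError if the status is still non-terminal once
    timeout_s seconds have passed, instead of waiting forever.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"batch still {status!r} after {timeout_s} s")
        sleep(interval_s)


# Hypothetical wiring to the client used in this thread:
# batch = client.beta.vector_stores.file_batches.create(
#     vector_store_id=vector_store_id, file_ids=file_ids
# )
# status = poll_until_done(
#     lambda: client.beta.vector_stores.file_batches.retrieve(
#         batch.id, vector_store_id=vector_store_id
#     ).status
# )
```

On a timeout the caller can decide whether to cancel the batch, retry later, or just log the failure.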
Hi. I'm using AzureOpenAI (client = AzureOpenAI()).
I can query my vector store and I can upload files, but the files don't seem to actually get associated with the vector store when I open Azure AI Foundry and check the assistant vector stores section.
I don't get an error; the files just don't appear to be attached. Any idea if this is a bug?
This works:
`
import logging
from datetime import datetime

# Retrieve files from the vector store.
def get_vector_store_files(vector_store_id, limit=100):
    file_data = []
    after = None
    while True:
        try:
            logging.info(f"Attempting to fetch vector store files for {vector_store_id} with after={after}")
            response = client.vector_stores.files.list(vector_store_id, limit=limit, after=after)
        except Exception as e:
            logging.error(f"Error fetching vector store files: {e}")
            break
        for file in response.data:
            try:
                file_detail = client.files.retrieve(file.id)
                upload_time = datetime.fromtimestamp(file_detail.created_at)
                vs_filename = file_detail.filename
                file_key = vs_filename[-16:]  # use last 16 characters for matching
                file_data.append({
                    "VectorStoreFileName": vs_filename,
                    "FileKey": file_key,
                    "VectorStoreUpload": upload_time,
                    "VectorStoreFileID": file_detail.id
                })
                logging.info(f"Retrieved vector store file: {vs_filename}")
            except Exception as e:
                logging.error(f"Error retrieving details for file ID {file.id}: {e}")
        # Continue from the last item while the API reports more pages.
        if getattr(response, 'has_more', False):
            after = response.data[-1].id
        else:
            break
    return file_data
`
This runs without errors too, but it does not actually put the files on the vector store itself.
`
def upload_file_to_vector_store(vector_store_id, file_path):
    try:
        with open(file_path, "rb") as f:
            # Use the file batch helper to upload and attach the file.
            file_batch = client.vector_stores.file_batches.upload_and_poll(
                vector_store_id=vector_store_id, files=[f]
            )
        logging.info(f"File batch upload status: {file_batch.status}")
        logging.info(f"File batch counts: {file_batch.file_counts}")
        return file_batch
    except Exception as e:
        logging.error(f"Error uploading file '{file_path}' to vector store: {e}")


def fix_any_upload_issues(vector_store_id):
    """
    Checks the vector store for files with a status of 'failed' and attempts to reattach them.
    Retries up to 5 times per file.
    """
    try:
        files_page = client.vector_stores.files.list(vector_store_id)
        files = files_page.data
    except Exception as e:
        logging.error(f"Error listing files from vector store {vector_store_id}: {e}")
        return

    # Filter files that have a 'failed' status.
    failed_files = [f for f in files if getattr(f, "status", None) == "failed"]
    logging.info(f"Initial failed files: {[f.id for f in failed_files]}")

    for failed_file in failed_files:
        attempt = 0
        success = False
        while attempt < 5 and not success:
            attempt += 1
            logging.info(f"Attempt {attempt} for file {failed_file.id}")
            try:
                # Attempt to reattach the failed file to the vector store.
                # Note: Using the create() method here. Depending on your SDK version,
                # you might need to call client.vector_stores.files.create(...)
                client.vector_stores.files.create(
                    vector_store_id, file_id=failed_file.id
                )
                # After the attempt, re-read the file list and check if this file still fails.
                updated_files_page = client.vector_stores.files.list(vector_store_id)
                updated_files = updated_files_page.data
                updated_failed_files = [
                    f for f in updated_files if getattr(f, "status", None) == "failed"
                ]
                if not any(f.id == failed_file.id for f in updated_failed_files):
                    success = True
                    logging.info(f"Successfully reattached file {failed_file.id}")
                else:
                    logging.info(f"File {failed_file.id} still in failed status after attempt {attempt}")
            except Exception as error:
                logging.error(
                    f"Failed to reattach file {failed_file.id} on attempt {attempt}: {error}"
                )
    logging.info("Finished processing failed files.")
`
It’s April 2025, and I can confirm that this issue persists. I’ve implemented a workaround by detecting failures, removing the file reference from the vector store (not the actual file), and initiating a re-upload. This approach seems to work on the first retry.
I hope Microsoft and/or OpenAI will address this issue soon because this workaround is a hack and shouldn’t be the burden of us end-users.
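For reference, the workaround described above could look roughly like the sketch below. This is one reading of that description, not the commenter's actual code: it detaches each failed file from the vector store and attaches the same file id again, rather than uploading the file a second time. The name `reattach_failed_files` is hypothetical, and `vector_store_files` is assumed to be `client.vector_stores.files` (or `client.beta.vector_stores.files` on older SDK versions).

```python
from typing import Any, List


def reattach_failed_files(vector_store_files: Any, vector_store_id: str) -> List[str]:
    """Detach every failed file from the vector store and attach it again.

    Returns the ids of files that did not reach 'completed' after one retry.
    """
    failed = [
        f for f in vector_store_files.list(vector_store_id) if f.status == "failed"
    ]

    still_failed: List[str] = []
    for f in failed:
        # This removes only the vector store reference;
        # the uploaded file itself is kept.
        vector_store_files.delete(f.id, vector_store_id=vector_store_id)

        # Attach the same file again and wait for processing to finish.
        result = vector_store_files.create_and_poll(
            f.id, vector_store_id=vector_store_id
        )
        if result.status != "completed":
            still_failed.append(f.id)
    return still_failed


# Hypothetical usage with a real client:
# leftover = reattach_failed_files(client.vector_stores.files, vector_store_id)
```

Note that `create_and_poll` can itself block if the service leaves the file `in_progress`, so in practice this would need a timeout around it.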