sample-app-aoai-chatGPT

How to delete a document from Search index? Any suggestions how to do it

Open rehat22 opened this issue 10 months ago • 2 comments

I used the following script to delete a document from Azure AI Search:

    import requests

    service_name = ""
    index_name = ""
    api_version = "2023-07-01-preview"
    admin_api_key = ""
    document_key = ""

    url = f"https://{service_name}.search.windows.net/indexes/{index_name}/docs/index?api-version={api_version}"

    # One delete action for the document whose key is document_key
    data = {"value": [{"id": document_key, "@search.action": "delete"}]}

    headers = {"Content-type": "application/json", "api-key": admin_api_key}

    response = requests.post(url, headers=headers, json=data)

    if response.status_code in (200, 204):
        print("Document deleted successfully!")
    else:
        print(f"Error deleting document: {response.status_code} {response.text}")

I got a success message when I ran it, but the document is still in the index. Is there another way to approach this?
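
One thing that may be worth checking here: the indexing endpoint reports a result per document in the response body, and as far as I know a delete for a key that does not exist is still reported as a success, so an HTTP 200 on its own does not show that anything was removed. Below is a minimal sketch, assuming the documented response shape (a `value` list whose items carry `key`, `status`, `statusCode` and `errorMessage`). `failed_items` is a hypothetical helper name, and the sample body is hand-written rather than real service output.

```python
def failed_items(response_body):
    """Return (key, statusCode, errorMessage) for every item that did not succeed."""
    return [
        (item.get("key"), item.get("statusCode"), item.get("errorMessage"))
        for item in response_body.get("value", [])
        if not item.get("status", False)
    ]


# Hand-written body in the assumed shape (not real service output).
sample = {
    "value": [
        {"key": "doc-1", "status": True, "errorMessage": None, "statusCode": 200},
        {"key": "doc-2", "status": False, "errorMessage": "throttled", "statusCode": 503},
    ]
}
print(failed_items(sample))
```

In the script above this would be called as `failed_items(response.json())`. It may also be worth confirming that `id` really is the key field of the index being queried, since the delete action identifies documents by the key field only.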

rehat22 avatar Apr 18 '24 12:04 rehat22

The chat interaction uses the chunk index as well, so you'd have to clear the chunks to get the data out of the index. I'm not sure whether there is a way to identify which chunks are associated with a specific file, so I delete all of the chunks and reset the indexers so that they create and index new chunks.
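
The "reset the indexers" step can be sketched roughly as follows, using the REST operations for resetting and running an indexer. This is a sketch under assumptions, not the exact procedure used above: the helper names, the `2023-11-01` API version and the 30-second timeout are all chosen for illustration, and clearing the chunks themselves is not shown.

```python
def indexer_action_url(service_name, indexer_name, action, api_version="2023-11-01"):
    """Build the URL for an indexer 'reset' or 'run' request."""
    if action not in ("reset", "run"):
        raise ValueError(f"unsupported indexer action: {action}")
    return (
        f"https://{service_name}.search.windows.net/indexers/"
        f"{indexer_name}/{action}?api-version={api_version}"
    )


def reset_and_run_indexer(service_name, indexer_name, admin_api_key):
    """Reset the indexer's change tracking, then start a fresh run."""
    # Imported here so the URL helper works without the dependency installed.
    import requests

    headers = {"api-key": admin_api_key}
    for action in ("reset", "run"):
        url = indexer_action_url(service_name, indexer_name, action)
        response = requests.post(url, headers=headers, timeout=30)
        response.raise_for_status()  # raise on any 4xx/5xx answer
```

A call would look like `reset_and_run_indexer("my-service", "my-indexer", admin_api_key)`, where both names are placeholders.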

guyyardeni avatar Apr 18 '24 20:04 guyyardeni

thank you @guyyardeni

rehat22 avatar Apr 23 '24 10:04 rehat22

> The chat interaction uses the chunk index as well, so you'd have to clear the chunks to get the data out of the index. I'm not sure whether there is a way to identify which chunks are associated with a specific file, so I delete all of the chunks and reset the indexers so that they create and index new chunks.

You should be able to use the parent_id field in the chunk index. It holds the id of the original document, so all chunks derived from a given document have the same value for this field.
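
A minimal sketch of that approach: search the chunk index for every chunk whose parent_id matches, then send one delete action per chunk key. Several things here are assumptions rather than facts from this thread: that the key field is named `chunk_id`, that `parent_id` is marked filterable, the `2023-11-01` API version, and all helper names.

```python
def parent_filter(parent_id, parent_field="parent_id"):
    """Build an OData filter matching every chunk of one parent document."""
    escaped = parent_id.replace("'", "''")  # OData escapes a quote by doubling it
    return f"{parent_field} eq '{escaped}'"


def delete_batch(keys, key_field="chunk_id"):
    """Build an indexing payload with one delete action per chunk key."""
    return {"value": [{"@search.action": "delete", key_field: key} for key in keys]}


def delete_chunks_for_parent(service_name, index_name, admin_api_key, parent_id,
                             key_field="chunk_id", api_version="2023-11-01"):
    """Find the chunks of one parent document and delete them; return the count."""
    # Imported here so the pure helpers above work without the dependency installed.
    import requests

    base = f"https://{service_name}.search.windows.net/indexes/{index_name}/docs"
    headers = {"Content-Type": "application/json", "api-key": admin_api_key}
    params = {"api-version": api_version}

    # Fetch only the key field of the matching chunks (at most 1000 per request).
    query = {"search": "*", "filter": parent_filter(parent_id),
             "select": key_field, "top": 1000}
    found = requests.post(f"{base}/search", headers=headers, params=params,
                          json=query, timeout=30)
    found.raise_for_status()
    keys = [doc[key_field] for doc in found.json()["value"]]
    if not keys:
        return 0

    deleted = requests.post(f"{base}/index", headers=headers, params=params,
                            json=delete_batch(keys, key_field), timeout=30)
    deleted.raise_for_status()
    return len(keys)
```

If a document has more than 1000 chunks, the function would need to be called repeatedly until it returns 0. Unless the source file is also removed from the data source, the indexer will presumably recreate the chunks on a later run.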

jack-vinitsky avatar Jun 10 '24 15:06 jack-vinitsky

@rehat22 If this answer resolves your issue, please mark it as closed.

jack-vinitsky avatar Jun 21 '24 13:06 jack-vinitsky

Hi, you can do it. Thank you.

rehat22 avatar Jun 21 '24 15:06 rehat22

> Hi, you can do it. Thank you.

I think only the person who opened the issue or an admin can do it.

jack-vinitsky avatar Jun 25 '24 00:06 jack-vinitsky