aws-genai-llm-chatbot

feat: Added delete document functionality

Open azaylamba opened this issue 1 year ago • 1 comments

Issue #149 :

Description of changes: Added delete functionality for all document types (Files, Texts, Q&A and Websites). The feature deletes a document from the S3 upload bucket, the S3 processed bucket, the DynamoDB documents table and the OpenSearch index, and also updates the DynamoDB workspaces table. The major code changes are:

  1. Added a delete button in the UI for each row of the documents table.
  2. Added a confirmation dialog (a Modal) so that the user can cancel or confirm the deletion.
  3. Created an AWS Step Functions state machine for the delete-document workflow. This keeps the whole process organised, and it is automatically rolled back if any operation in the step function fails.

The major components and how they work:

  1. documents-tab.tsx contains the delete button and the handling of the confirmation Modal.
  2. documents-client.ts has the deleteDocument function, which calls the backend API.
  3. The delete_document function in lib/chatbot-api/functions/api-handler/routes/documents.py handles the API request.
  4. deleteDocumentWorkflow is created in lib/rag-engines/workspaces/index.ts.
  5. delete-document.ts defines the internal structure of the delete-document workflow.
  6. The Lambda function that handles the workflow is in lib/rag-engines/workspaces/functions/delete-document-workflow/delete/index.py.
  7. The state machine execution is started in the delete_document function of lib/shared/layers/python-sdk/python/genai_core/documents.py.
  8. The actual deletion of documents happens in the delete_open_search_document function of lib/shared/layers/python-sdk/python/genai_core/opensearch/delete.py.
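
To illustrate item 7, here is a minimal sketch of how a state machine execution could be started from Python. It is not the code from this PR: the function name `start_delete_workflow`, the payload keys `workspace_id` and `document_id`, and the execution-name format are assumptions made for illustration. The only real API assumed is the `start_execution` call of the boto3 Step Functions client, which is passed in so the function can be exercised without AWS access.

```python
import json
import uuid


def start_delete_workflow(sfn_client, state_machine_arn, workspace_id, document_id):
    """Start one execution of a (hypothetical) delete-document state machine.

    `sfn_client` is expected to behave like boto3.client("stepfunctions").
    The payload keys below are illustrative, not taken from the PR.
    """
    payload = {"workspace_id": workspace_id, "document_id": document_id}

    # Execution names must be unique per state machine, so a short random
    # suffix is appended to the document id.
    execution_name = f"delete-{document_id}-{uuid.uuid4().hex[:8]}"

    response = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        name=execution_name,
        input=json.dumps(payload),  # Step Functions takes the input as a JSON string
    )
    return response["executionArn"]
```

In a real handler the client would typically be created once with `boto3.client("stepfunctions")` and the state machine ARN read from configuration such as an environment variable; both details are assumptions here.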

The request flow is: documents-client -> documents.py (api handler) -> documents.py (genai_core) -> index.py (delete-document-workflow) -> delete.py (genai_core/opensearch)
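
The last hop of this flow, removing a document's entries from the OpenSearch index, could be sketched as follows. This is an assumption-heavy illustration rather than the PR's implementation: it assumes that every indexed chunk stores its parent document's id in a field named `document_id`, and that the target OpenSearch deployment supports the delete-by-query API. The helper names are hypothetical; the only library call assumed is `delete_by_query` on an opensearch-py style client, which is injected as a parameter.

```python
from typing import Any, Dict


def build_delete_query(document_id: str) -> Dict[str, Any]:
    """Build a query matching every chunk that belongs to one document.

    Assumes a `document_id` field on each indexed chunk (hypothetical mapping).
    """
    return {"query": {"term": {"document_id": document_id}}}


def delete_document_chunks(client: Any, index_name: str, document_id: str) -> int:
    """Delete all chunks of a document and return how many were removed.

    `client` is expected to behave like an opensearchpy.OpenSearch instance.
    """
    response = client.delete_by_query(
        index=index_name,
        body=build_delete_query(document_id),
    )
    # The delete-by-query response reports the number of removed documents
    # under "deleted"; default to 0 if the key is absent.
    return response.get("deleted", 0)
```

Keeping the query builder separate from the call makes the matching rule easy to unit test and to adjust if the index mapping uses a different field name.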

This change also updates the version of opensearch-py. The update was made initially because the earlier version did not allow calling HTTP methods directly; later in development, direct HTTP calls turned out not to be required. Kept the update for future use, as it should have no impact.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

azaylamba, Apr 20 '24 19:04

@massi-ang @bigadsoleiman Could you please review this?

azaylamba, Apr 24 '24 16:04

@bigadsoleiman @massi-ang would you be able to have a look at this PR?

azaylamba, May 26 '24 15:05

Thanks @azaylamba for this. Looking into it.

bigadsoleiman, Jun 10 '24 12:06

Good one and well detailed!

Tested with all engines and against an existing stack to ensure retro-compatibility with already uploaded docs. Happy to merge.

bigadsoleiman, Jun 10 '24 15:06