sample-app-aoai-chatGPT

Update data_utils.py

Open Knavneet opened this issue 1 year ago • 0 comments

Included the edge case where the lines list is empty.

Motivation and Context

The key change is in the extract_caption method. I added a check so that the last line is added only if the lines list is not empty.
  1. What problem does it solve? This change should prevent the IndexError that I was encountering.
  2. What scenario does it contribute to? It resolves the error raised when the code tries to access the last element of the lines list while the list is empty.
  3. If it fixes an open issue, please link to the issue here.
  4. Does this solve an issue or add a feature that all users of this sample app can benefit from? Contributions will only be accepted that apply across all users of this app. Yes. Smart chunking is used by everyone developing a RAG application with this app.

Description

I was encountering an IndexError in the extract_caption method of the PdfTextSplitter class. The error occurs because the code tries to access the last element of the lines list when the list is empty.

Updated PdfTextSplitter: the key change is in the extract_caption method. I added a check so that the last line is added only if the lines list is not empty.
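For reviewers who want to see the shape of the guard, here is a minimal sketch of the pattern. It is not the actual code in data_utils.py: the function names `non_empty_lines` and `caption_with_context`, their parameters, and the way the lines list is built are all hypothetical and chosen only to illustrate checking the list before reading `lines[-1]`.

```python
from typing import List


def non_empty_lines(text: str) -> List[str]:
    # Hypothetical helper: split text into lines and drop blank ones.
    # For empty or whitespace-only input this returns an empty list.
    return [line for line in text.splitlines() if line.strip()]


def caption_with_context(text_before: str, caption: str) -> str:
    # Hypothetical stand-in for the guarded step described above.
    lines = non_empty_lines(text_before)
    if lines:
        # Safe: the list has at least one element, so lines[-1] exists.
        return lines[-1] + "\n" + caption
    # Without the guard, lines[-1] would raise IndexError here.
    return caption
```

With this guard, an empty or whitespace-only preceding text simply yields the caption unchanged instead of raising.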

Contribution Checklist

  • [x] I have built and tested the code locally and in a deployed app
  • [x] For frontend changes, I have pulled the latest code from main, built the frontend, and committed all static files.
  • [x] This is a change for all users of this app. No code or asset is specific to my use case or my organization.
  • [x] I didn't break any existing functionality :smile:

Knavneet, Jul 22 '24 15:07