azure-search-openai-demo
How to add additional metadata to the index from JSON files
My dataset consists of PDFs and JSON files that contain additional metadata; each JSON file shares its name with the PDF it describes. I want to use the JSON files as additional metadata for the PDFs. My approach was to parse each JSON file into additional index fields, but it did not work.
I looked at https://github.com/Azure-Samples/azure-search-openai-demo/blob/main/docs/customization.md#other-approaches-to-improve-search-results
and tried changing searchmanager.py accordingly,
but could not get the result I wanted. I added additional 'SimpleField' entries,
but they were null after running 'prepdocs.py':
https://github.com/Azure-Samples/azure-search-openai-demo/blob/main/app/backend/prepdocslib/searchmanager.py#L106
The JSON is structured like this:
{
  "title": "product",
  "isRelevant": true,
  "location": "en",
  "remarks": "comments",
  "thumbnailURL": "https://",
  "fileName": "file.pdf"
}
Can someone help me create the index correctly, or suggest another approach for attaching the JSON metadata to the PDFs to improve search?
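For reference, here is a minimal sketch of the sidecar lookup described above, in plain Python with no Azure SDK calls. The helper names `load_sidecar_metadata` and `add_metadata` and the `METADATA_KEYS` tuple are hypothetical and not part of the repository. It assumes that declaring a field in the index only defines the field, so each document uploaded by prepdocs.py also has to carry the values, which would explain the nulls.

```python
import json
from pathlib import Path

# Hypothetical list of metadata keys to copy into each index document;
# the names mirror the JSON example above.
METADATA_KEYS = ("title", "isRelevant", "location", "remarks", "thumbnailURL")


def load_sidecar_metadata(pdf_path):
    """Return metadata from the JSON file sharing the PDF's base name, or {}."""
    json_path = Path(pdf_path).with_suffix(".json")
    if not json_path.is_file():
        # No sidecar file: the extra fields simply stay unset for this PDF.
        return {}
    with json_path.open(encoding="utf-8") as f:
        raw = json.load(f)
    # Keep only the keys that are expected to exist as index fields.
    return {key: raw[key] for key in METADATA_KEYS if key in raw}


def add_metadata(document, metadata):
    """Return a copy of an index document dict with the metadata merged in."""
    merged = dict(document)
    merged.update(metadata)
    return merged
```

The merged dictionary would then have to be what actually gets uploaded, with a matching field declared in the index for every key. Where that hook belongs in searchmanager.py is the open question.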