Data source information is lost for a parent URL after adding a child URL
🐛 Describe the bug
I'm using the Embedchain demo in a Colab notebook to prototype against our spirits-related data set. The information from a parent product page is being "forgotten" by the bot after I add child pages of that parent. In this case, I'm testing against product URLs from our site.
First, I add our parent product page, which has information on a spirit:
from embedchain import App
distiller_bot = App()
distiller_bot.add("https://distiller.com/spirits/blanton-s-original-single-barrel")
Upon querying for information on the spirit, I receive an accurate response:
distiller_bot.query("What can you tell me about Blanton's Original Single Barrel, and what do the reviews say?")
Output:
Blanton's Original Single Barrel is a bourbon whiskey that was first produced in 1984 by Master Distiller Elmer T Lee. It is named after Colonel Albert Blanton, who had a long career in the whiskey industry. Blanton's is known for being the first mass-produced single-barrel bourbon. As for the reviews, the community rating for Blanton's Original Single Barrel is 4.29 out of 5 stars, based on a total review count of 11,882. Unfortunately, the specific details of the reviews are not provided, so we cannot provide any further information about what the reviews say.
This all checks out; the parent page has these details and the ratings overview.
Then, I add the child page for user reviews to the data sources:
distiller_bot.add("https://distiller.com/spirits/blanton-s-original-single-barrel/tastes?require_note=true")
Upon making the same query:
distiller_bot.query("What can you tell me about Blanton's Original Single Barrel, and what do the reviews say?")
I get the following output:
Based on the given context, Blanton's Original Single Barrel is a type of whiskey. The reviews mention that it has a sweet taste with a light nose and a fast finish. The flavors of cinnamon and vanilla are also mentioned. The price is stated as 13.0 USD per pour. However, there is no additional details about Blanton's Original Single Barrel.
1 ) The sentence "However, there is no additional details about Blanton's Original Single Barrel" appears consistently on every re-test of the scenario above.
2 ) Changing the order in which the data sources are added makes no difference.
3 ) It does not seem to be related to the config settings (.yaml):
app:
  config:
    id: 'distiller-bot'
llm:
  provider: openai
  config:
    model: 'gpt-3.5-turbo'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
4 ) I've tried multiple iterations with different sample URLs, etc.
Is there something I'm not considering?
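One possibility I have not verified is that retrieval only passes a fixed number of top-scoring chunks to the LLM, so that chunks from the reviews page crowd out the parent page's chunks for this query. The following is a toy Python sketch of that effect, not Embedchain's actual retrieval code: the chunk texts, the word-overlap score, the `top_k_sources` helper, and `k=2` are all hypothetical stand-ins for the real embeddings and settings.

```python
import re

QUERY = ("What can you tell me about Blanton's Original Single Barrel, "
         "and what do the reviews say?")

# Hypothetical chunks, loosely paraphrased from the two outputs above.
PARENT_CHUNKS = [
    {"source": "parent", "text": "Blanton's Original Single Barrel is a bourbon first produced in 1984"},
    {"source": "parent", "text": "Community rating: 4.29 out of 5 stars"},
]
CHILD_CHUNKS = [
    {"source": "child", "text": "Reviews of Blanton's Original Single Barrel: sweet taste, light nose, and a fast finish"},
    {"source": "child", "text": "Reviews of Blanton's Original Single Barrel: cinnamon and vanilla"},
    {"source": "child", "text": "Reviews of Blanton's Original Single Barrel: 13.0 USD per pour"},
]


def tokenize(text):
    """Lowercase the text and return its set of distinct words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def overlap_score(query, text):
    """Toy relevance score: number of distinct words shared with the query."""
    return len(tokenize(query) & tokenize(text))


def top_k_sources(query, chunks, k=2):
    """Return the sources of the k highest-scoring chunks (assumed fixed k)."""
    ranked = sorted(chunks, key=lambda c: overlap_score(query, c["text"]), reverse=True)
    return [c["source"] for c in ranked[:k]]


# Only the parent page has been added: both retrieved chunks come from it.
before = top_k_sources(QUERY, PARENT_CHUNKS)
# After adding the reviews page: the child chunks outrank the parent chunks.
after = top_k_sources(QUERY, PARENT_CHUNKS + CHILD_CHUNKS)

print(before)  # ['parent', 'parent']
print(after)   # ['child', 'child']
```

In this toy setup the order in which the sources are added does not change the result, which would at least be consistent with observation 2 above. Whether the real retrieval step behaves this way is a guess on my part.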