WebPageDemo.ipynb error

Open beddows opened this issue 2 years ago • 8 comments

In the file examples/data_connectors/WebPageDemo.ipynb, the "Using RssReader" demo throws the error: AttributeError: 'list' object has no attribute 'split'

Any thoughts on what this could be?

beddows avatar Jan 27 '23 16:01 beddows

Can you share the trace of the error message?

simonManydata avatar Jan 29 '23 19:01 simonManydata

@beddows is it fixed now? The RSS reader has been updated.

jerryjliu avatar Jan 29 '23 22:01 jerryjliu

@jerryjliu @simonManydata Still getting the error - trace attached

WebPageDemo_RssReader_attributeError.txt

beddows avatar Jan 30 '23 14:01 beddows

Apologies, I discovered an older version was installed. Latest version installed and API key verified. New error attached.

WebPageDemo_RssReader_attributeError-2.txt

beddows avatar Jan 30 '23 16:01 beddows

@bborn do you have an idea of what the issue is?

jerryjliu avatar Feb 06 '23 07:02 jerryjliu

@beddows what's the URL of the feed you're passing it? It looks like feedparser isn't finding a content element.

bborn avatar Feb 06 '23 11:02 bborn

@bborn I'm using the provided links in the notebook, which seem to be valid:

    documents = RssReader().load_data([
        "https://rss.nytimes.com/services/xml/rss/nyt/HomePage.xml",
        "https://roelofjanelsinga.com/atom.xml",
    ])

beddows avatar Feb 12 '23 16:02 beddows

@jerryjliu OK, it looks like the content attribute isn't there for some types of feeds, so we need to check for it in the Dict first. Sent a PR: https://github.com/jerryjliu/gpt_index/pull/435
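
For illustration, here is a minimal sketch of the kind of check described above. It is not the code from the PR. The helper name `extract_entry_text` and the fallback order (content, then description, then summary) are assumptions made for this example. It relies only on the fact that a feedparser entry behaves like a dict, and that `content`, when present, is a list of dict-like items with a `value` key.

```python
from typing import Any, Mapping


def extract_entry_text(entry: Mapping[str, Any]) -> str:
    """Return the body text of a feed entry, or "" if none is found.

    Hypothetical helper: the name and the fallback order are
    assumptions for this sketch, not the library's implementation.
    """
    # Some feeds carry a "content" list, others omit it entirely,
    # so test for the key before indexing into it.
    if "content" in entry and entry["content"]:
        first_item = entry["content"][0]
        return first_item.get("value", "")

    # Fall back to the shorter text fields that feeds without
    # "content" commonly provide.
    for key in ("description", "summary"):
        if key in entry and entry[key]:
            return entry[key]

    return ""
```

With a helper like this, an entry that lacks `content` yields its description or summary instead of raising an error, and an entry with none of these fields yields an empty string that the caller can skip.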

bborn avatar Feb 12 '23 17:02 bborn

Closed with #435

jerryjliu avatar Feb 20 '23 19:02 jerryjliu