feedparser
Parse feeds in Python
I noticed that the latest released version of feedparser crashes when a CDATA section contains a C code snippet. Here is an example of how to reproduce the issue. -...
I have encountered an issue when using the feedparser library to parse RSS directly from a URL. For example: ``` >>> feedparser.parse('https://hackernewsrss.com/feed.xml').keys() dict_keys(['bozo', 'entries', 'feed', 'headers', 'bozo_exception']) >>> d =...
I use this indirectly via Feediverse (an RSS to Fediverse poster), and I just had that post an entire feed’s worth of posts again. What I think happened here is...
I am looking to pull the entirety of an article's content so that I can summarize it using ChatGPT. However, for every entry it appears that content is populated with...
We have a couple of RSS sources, such as https://economictimes.indiatimes.com/tech/rssfeeds/13357270.cms and https://www.cbsnews.com/latest/rss/us. These sources have an `image` tag under `item`. ``` some title some descr https://somelink.com https://somelink.com/1.jpg 100958160 2023-06-13T12:32:16+05:30 ``` I'm aware...
Given the following example atom file: ``` Lorem Ipsum Dolor Sit Amet 2013-01-08T12:04:00.000Z Lorem Ipsum Dolor Sit Amet placeholder, Placeholder, Placeholder Placeholder, Placeholder Placeholder Lorem ipsum dolor sit amet, consectetur...
I just spent a while trying to figure out why this code didn't work as expected: ```python field = 'updated' # `data` is a FeedParserDict retrieved from `parsed_feed.feed`... if field...
For example, there is a popular podcast website Buzzsprout that uses Atom for its feeds. Each `item` has no `link` but only an `enclosure`. Here's [an example feed](https://feeds.buzzsprout.com/99850.rss): ```xml ......
Exceptions raised within feedparser are currently caught internally and an empty/half-parsed result is returned with the exception in the bozo_exception attribute. To be honest: I personally think this is not...
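For readers who want the stricter behavior described in this report, a small wrapper can turn the `bozo` flag back into a raised exception. This is only a sketch, not part of feedparser: `FeedParseError` and `ensure_well_formed` are hypothetical names, and plain dicts stand in for the result of `feedparser.parse` so the example runs without the library or a network connection.

```python
class FeedParseError(Exception):
    """Hypothetical error type; feedparser itself does not define it."""


def ensure_well_formed(result):
    """Return `result` unchanged, or raise if its `bozo` flag is set.

    `result` is assumed to be dict-like, as the object returned by
    `feedparser.parse` is.
    """
    if result.get("bozo"):
        exc = result.get("bozo_exception")
        # Chain the original exception only if it really is one.
        cause = exc if isinstance(exc, BaseException) else None
        raise FeedParseError(f"feed is not well-formed: {exc!r}") from cause
    return result


# Stand-ins for parse results (assumed shapes, for illustration only).
good = {"bozo": False, "entries": []}
bad = {"bozo": True, "bozo_exception": ValueError("mismatched tag"), "entries": []}

ensure_well_formed(good)  # passes through untouched

try:
    ensure_well_formed(bad)
except FeedParseError as err:
    print("rejected:", err)
```

Note that a set `bozo` flag does not always mean the result is unusable, so raising on every flagged feed may be too strict for some applications.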
Normally I find that an entry's [`enclosures` property](https://feedparser.readthedocs.io/en/latest/reference-entry-enclosures.html) always exists, but is an empty list if the entry has no enclosures. However, if an entry has an `id`, but no...
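Until the inconsistency is resolved, callers can avoid depending on `enclosures` being present. The sketch below assumes only that an entry is dict-like and that each enclosure, when present, carries an `href` key; `enclosure_urls` is a hypothetical helper, and the sample entries and URLs are made up for illustration.

```python
def enclosure_urls(entry):
    """Return the enclosure URLs of an entry, or [] if there are none.

    Uses .get() so a missing `enclosures` key is treated the same way
    as an empty list.
    """
    enclosures = entry.get("enclosures") or []
    return [enc["href"] for enc in enclosures if enc.get("href")]


# Plain dicts stand in for feedparser entries (assumed shapes).
with_audio = {
    "id": "episode-1",
    "enclosures": [{"href": "https://example.com/ep1.mp3", "type": "audio/mpeg"}],
}
without_key = {"id": "episode-2"}

print(enclosure_urls(with_audio))
print(enclosure_urls(without_key))
```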