
Incorrect description when subtitle present in feed

Open mmcdole opened this issue 10 years ago • 2 comments

Take the following podcast rss: http://feeds.serialpodcast.org/serialpodcast

It has both an itunes:subtitle and a description for the feed element. FeedParser only ever returns the itunes:subtitle, even when attempting to access feed.description.

I think that for feed.description, channel/description should take precedence over the itunes subtitle. We already have a separate feed.subtitle.

mmcdole avatar Sep 13 '15 16:09 mmcdole

I believe the issue is this in util.py:

https://github.com/kurtmckee/feedparser/blob/8c6294042b749544ad87f73cf55da143bf55e921/feedparser/util.py#L91

    if isinstance(realkey, list):
        for k in realkey:
            if dict.__contains__(self, k):
                return dict.__getitem__(self, k)
    elif dict.__contains__(self, realkey):
        return dict.__getitem__(self, realkey)

This checks the keymap below whenever an element is accessed. The problem is that it looks at the alternative names before looking at the name actually given to the FeedParserDict.

    keymap = {'channel': 'feed',
              'items': 'entries',
              'guid': 'id',
              'date': 'updated',
              'date_parsed': 'updated_parsed',
              'description': ['summary', 'subtitle'],
              'description_detail': ['summary_detail', 'subtitle_detail'],
              'url': ['href'],
              'modified': 'updated',
              'modified_parsed': 'updated_parsed',
              'issued': 'published',
              'issued_parsed': 'published_parsed',
              'copyright': 'rights',
              'copyright_detail': 'rights_detail',
              'tagline': 'subtitle',
              'tagline_detail': 'subtitle_detail'}

So in this case, I asked for feed['description']. It looked up "description" in this dictionary and saw a list of alternative names: summary and subtitle. It looked up both, found a subtitle, and returned the itunes:subtitle, even though I asked for description and the feed had one.

The if-statement I posted above should be flipped so that it first looks for the key I give it and falls back to the alternative names only if that key is missing.
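To make the proposed flip concrete, here is a minimal sketch, not feedparser's actual code. `AliasFirstDict` and `ExactKeyFirstDict` are hypothetical names, the keymap is cut down to one entry, and the example assumes the dictionary really holds both a 'description' and a 'subtitle' key.

```python
class AliasFirstDict(dict):
    """Simplified stand-in for the lookup quoted above: aliases are tried first."""

    # Cut-down keymap; the real one has many more entries.
    keymap = {'description': ['summary', 'subtitle']}

    def __getitem__(self, key):
        realkey = self.keymap.get(key, key)
        if isinstance(realkey, list):
            for k in realkey:
                if dict.__contains__(self, k):
                    return dict.__getitem__(self, k)
        elif dict.__contains__(self, realkey):
            return dict.__getitem__(self, realkey)
        # Nothing matched via the keymap: fall back to the literal key.
        return dict.__getitem__(self, key)


class ExactKeyFirstDict(AliasFirstDict):
    """Proposed order: the key as given wins, aliases are only a fallback."""

    def __getitem__(self, key):
        if dict.__contains__(self, key):
            return dict.__getitem__(self, key)
        return super().__getitem__(key)


# Assumption: both keys are stored, which may not match what the parser does.
feed = {'description': 'Full channel description',
        'subtitle': 'Short iTunes subtitle'}

print(AliasFirstDict(feed)['description'])     # Short iTunes subtitle
print(ExactKeyFirstDict(feed)['description'])  # Full channel description
```

With the flipped order, a feed that only has a subtitle would still answer feed['description'] through the alias fallback.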

mmcdole avatar Sep 14 '15 12:09 mmcdole

Actually, that is not the case. As the code snippet below shows, the parser maps description to subtitle when it stores the value. So if the feed has <description> followed by <itunes:subtitle>, the subtitle overwrites the description that was previously stored under subtitle.

    context = self._getContext()
    if element == 'description':
        element = 'subtitle'
    context[element] = output
    if element == 'link':
        # fix query variables; see above for the explanation
        output = re.sub("&([A-Za-z0-9_]+);", "&\g<1>", output)
        context[element] = output
        context['links'][-1]['href'] = output
    elif self.incontent:
        contentparams = copy.deepcopy(self.contentparams)
        contentparams['value'] = output
        context[element + '_detail'] = contentparams
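The overwrite can be reproduced with a small stand-alone sketch. `store` is a hypothetical helper that copies only the first three lines of the snippet above, and it assumes that itunes:subtitle reaches this code as the element name 'subtitle'.

```python
def store(context, element, output):
    """Mimic the quoted branch: 'description' is written under 'subtitle'."""
    if element == 'description':
        element = 'subtitle'
    context[element] = output


context = {}
# Elements in document order: <description> first, then <itunes:subtitle>.
# Assumption: itunes:subtitle arrives here under the name 'subtitle'.
for element, text in [('description', 'Full channel description'),
                      ('subtitle', 'Short iTunes subtitle')]:
    store(context, element, text)

print(context)  # {'subtitle': 'Short iTunes subtitle'}
```

Under these assumptions no 'description' key is ever stored, so changing the lookup order alone would not bring the channel description back.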

FarmaanElahi avatar Apr 02 '18 09:04 FarmaanElahi