
media:description element overwrites 'content' field

Open akuchling opened this issue 9 years ago • 1 comments

The handling of the media:description element in 5.2.1 ends up overwriting the 'content' field of an item. This seems like a particular case of issue #35.

An example feed item and test script are attached. 'description' and 'summary' of the single entry in the feed are set to the full story text (starting "Just like you sync your tablet ..."), which is some 4400-odd bytes. But 'content' is set to the 101-byte caption of the photo (starting "Bonnie Plants’ Homegrown free app keeps you growing in the garden.").

One possible fix is to make _start_media_description()/_end_media_description() their own methods instead of aliases for _start_description(), doing something like what _start_media_license()/_end_media_license() do. Alternatively, _start_description() may need to be more complicated and behave differently when in a media:content context.
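
To make the second idea concrete, here is a rough standalone sketch of context-aware routing. It is not feedparser code: ToyItemHandler, parse_item, the 'media_description' key and the sample XML are all made up for illustration, and it uses the stdlib xml.sax module rather than feedparser's internals. The only point is that a media:description seen inside media:content is attached to that media object instead of replacing the entry-level text.

```python
import xml.sax

# Toy feed item; the text values are placeholders, not the attached feed.
ITEM_XML = """<item xmlns:media="http://search.yahoo.com/mrss/">
  <description>Full story text.</description>
  <media:content url="http://example.com/photo.jpg">
    <media:description>Photo caption.</media:description>
  </media:content>
</item>"""


class ToyItemHandler(xml.sax.ContentHandler):
    """Hypothetical handler (NOT feedparser's API): routes descriptions by context."""

    def __init__(self):
        super().__init__()
        self.entry = {"summary": None, "media_content": []}
        self._in_media_content = False
        self._text = None  # buffer for the element currently being captured

    def startElement(self, name, attrs):
        if name == "media:content":
            self._in_media_content = True
            # Start a new media object carrying the element's attributes.
            self.entry["media_content"].append(dict(attrs.items()))
        elif name in ("description", "media:description"):
            self._text = []

    def characters(self, content):
        if self._text is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "media:content":
            self._in_media_content = False
        elif name in ("description", "media:description"):
            value = "".join(self._text).strip()
            self._text = None
            if name == "description":
                # Plain description: entry-level text.
                self.entry["summary"] = value
            elif self._in_media_content:
                # Caption belongs to the enclosing media object.
                self.entry["media_content"][-1]["description"] = value
            else:
                # media:description outside media:content: keep it separate.
                self.entry["media_description"] = value


def parse_item(xml_text):
    handler = ToyItemHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.entry
```

With this toy, parse_item(ITEM_XML) leaves 'summary' holding the story text and puts the caption under the first 'media_content' entry. Whether feedparser should store the caption there or under some other key is an open question for the maintainers.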

I'm happy to work on a patch, if given a direction to pursue.

media-desc-issue.zip

akuchling, Feb 24 '16 18:02

FYI, with the current 'develop' branch (commit ID f019d0673c60828faef691803ff74aad3e058f41), the 'content' attribute still contains the caption, but it's now set to a dictionary instead of just a string:

content {u'base': u'', u'type': u'text/plain', u'value': u'Bonnie Plants\u2019 Homegrown free app keeps you growing in the garden. (Photo courtesy Bonnie Plants/TNS)', u'language': None} 101

akuchling, Feb 24 '16 19:02