No more images after update 0.1.8 -> 0.1.11

Open BebeMischa opened this issue 2 years ago • 5 comments

Hello guys and girls,

I've just updated and now there are no more images in my feed. Is there something I need to change, or is it a bug?

My feed sensor:

  - platform: feedparser
    name: Het nieuws
    feed_url: 'https://www.nu.nl/rss'
    date_format: '%a, %d %b %Y %H:%M:%S %Z'
    scan_interval:
      minutes: 1
    inclusions:
      - title
      - link
      - description
      - image
      - pubDate
    exclusions:
      - language

Result:

[image]

Before this update it worked fine...

BebeMischa, Jul 03 '23 06:07

For anybody facing the same issue: I changed the code in the feedparser sensor. It should now pick up both image and enclosure correctly. Maybe this can be added to the official release as well!

# Existing code

if "image" in self._inclusions and "image" not in entry_value.keys():
    images = []
    if "summary" in entry.keys():
        images = re.findall(r"<img.+?src=\"(.+?)\".+?>", entry["summary"])
    if images:
        entry_value["image"] = images[0]
    else:
        if "links" in entry.keys():
            images = re.findall(
                '(?:(?:https?|ftp):\/\/)?[\w/\-?=%.]+\.[\w/\-&?=%.]+', str(entry["links"][1])
            )
        if images:
            entry_value["image"] = images[0]
        else:
            entry_value[
                "image"
            ] = "https://www.home-assistant.io/images/favicon-192x192-full.png"

# Modified code

if "image" in self._inclusions and "image" not in entry_value.keys():
    images = []
    if "summary" in entry.keys():
        images = re.findall(r"<img.+?src=\"(.+?)\".+?>", entry["summary"])
    if images:
        entry_value["image"] = images[0]
    else:
        if "enclosures" in entry.keys() and entry["enclosures"]:
            enclosure_url = entry["enclosures"][0].get("url")
            if enclosure_url:
                entry_value["image"] = enclosure_url
        else:
            if "links" in entry.keys():
                images = re.findall(
                    '(?:(?:https?|ftp):\/\/)?[\w/\-?=%.]+\.[\w/\-&?=%.]+', str(entry["links"][1])
                )
            if images:
                entry_value["image"] = images[0]
            else:
                entry_value[
                    "image"
                ] = "https://www.home-assistant.io/images/favicon-192x192-full.png"
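
One thing to watch in the modified code above: when an entry has an enclosure without a "url" key, no image is set at all, and entry["links"][1] raises an IndexError when an entry has only one link. The following is a minimal standalone sketch of the same fallback order (summary image, then enclosure, then second link, then placeholder) that falls through in those cases. It is not the upstream fix; pick_image, DEFAULT_IMAGE, IMG_SRC and URL_LIKE are hypothetical names, and the entry is assumed to behave like a plain dict.

```python
import re

# Placeholder URL taken from the snippets above; the constant name is hypothetical.
DEFAULT_IMAGE = "https://www.home-assistant.io/images/favicon-192x192-full.png"

# Same patterns as above, written as raw strings to avoid invalid-escape warnings.
IMG_SRC = re.compile(r'<img.+?src="(.+?)".+?>')
URL_LIKE = re.compile(r"(?:(?:https?|ftp)://)?[\w/\-?=%.]+\.[\w/\-&?=%.]+")


def pick_image(entry: dict) -> str:
    """Return an image URL for a feed entry, trying each source in turn."""
    # 1. First <img> tag embedded in the summary HTML.
    images = IMG_SRC.findall(entry.get("summary", ""))
    if images:
        return images[0]

    # 2. First enclosure that actually carries a URL.
    for enclosure in entry.get("enclosures") or []:
        url = enclosure.get("url")
        if url:
            return url

    # 3. URL-like text in the second link, if there is one.
    links = entry.get("links") or []
    if len(links) > 1:
        images = URL_LIKE.findall(str(links[1]))
        if images:
            return images[0]

    # 4. Nothing found: fall back to the placeholder.
    return DEFAULT_IMAGE
```

Inside the sensor, this would amount to something like entry_value["image"] = pick_image(entry) in place of the nested if/else, assuming the entry object supports .get() the way a dict does.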

zboersen, Jul 07 '23 13:07

I am looking into it in my free time. I also plan to add tests for the integration in #80 to ensure that future releases do not break stuff. If anyone finds a way to fix it, do not hesitate to submit a PR.

ogajduse, Jul 09 '23 14:07

Thanks, @zboersen, it did the trick ;-)

BebeMischa, Jul 13 '23 12:07

I have the fix. I should publish a new feedparser version with the fix this weekend.

@BebeMischa @zboersen Could you please share the RSS feed URLs that you use and that contain images? That would help me in extending the test coverage.

ogajduse, Jul 26 '23 21:07

#81 should fix this issue. Could you please try the beta release I published and tell me if images show up for you? https://github.com/custom-components/feedparser/releases/tag/0.2.0b0

If they do not show up, could you please provide the feed URL, so I can investigate?
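
When reporting a feed, it may also help to say where its images actually sit: in an enclosure element or as an img tag inside the description. Below is a rough, standard-library-only sketch for checking that on a downloaded RSS document. It is not part of the integration; image_sources and the SAMPLE feed are made up for illustration, and namespaced elements such as media:content are not covered.

```python
import re
import xml.etree.ElementTree as ET

# Made-up two-item feed used only to demonstrate the function.
SAMPLE = """<rss version="2.0"><channel>
<item><title>With enclosure</title>
<enclosure url="https://example.com/e.jpg" type="image/jpeg" length="0" /></item>
<item><title>With inline image</title>
<description>&lt;img src="https://example.com/a.jpg" /&gt; text</description></item>
</channel></rss>"""


def image_sources(rss_xml: str) -> list:
    """Report, per <item>, its enclosure URLs and whether the description embeds an <img>."""
    root = ET.fromstring(rss_xml)
    report = []
    for item in root.iter("item"):
        description = item.findtext("description") or ""
        report.append({
            "title": item.findtext("title") or "",
            # URLs of all <enclosure> children that have a url attribute.
            "enclosures": [e.get("url") for e in item.findall("enclosure") if e.get("url")],
            # True when the (unescaped) description contains an <img ...> tag.
            "img_in_description": bool(re.search(r"<img\b", description)),
        })
    return report


if __name__ == "__main__":
    for row in image_sources(SAMPLE):
        print(row)
```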

Note: #78, #57 and #64 should be addressed and fixed by #81.

ogajduse, Jul 28 '23 07:07