rightmove_webscraper.py
Scrape photo URLs
It'd be pretty handy to return a list of property photo URLs, or at least the primary/featured photo URL, e.g. https://media.rightmove.co.uk/64k/63334/85611534/63334_11482068_IMG_00_0000.jpeg
Agreed, this could be interesting. I don't have the bandwidth to look at it right now, but if you want to submit a PR I can review and merge it.
I'd be interested in this. I'll take a look at some point and see if I can submit a PR.
I'm currently making a 2nd request per property, which obviously isn't ideal, but here's how I'm extracting the primary image via XPath. Hope it helps.
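def _get_image(self, page_html):
    # The content attribute of this <meta> element is used as the primary image URL.
    image_url = page_html.html.xpath(
        '//*[@id="root"]/main/div/article/meta', first=True
    )
    try:
        return image_url.attrs["content"]
    except (AttributeError, KeyError):
        # No matching element (xpath returned None) or no "content" attribute.
        return "https://via.placeholder.com/450x300.png?text=No+Image!"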
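For anyone who wants to try the idea without a live request, here is a rough standalone sketch. It is not the snippet above and not what the library does: it swaps the XPath lookup for the standard library's html.parser, and the names ArticleMetaParser, get_image and the sample markup (including the itemprop attribute and the example.com URL) are made up for illustration. It only assumes that the image URL sits in the content attribute of a meta tag inside an article element.

```python
from html.parser import HTMLParser
from typing import Optional

PLACEHOLDER = "https://via.placeholder.com/450x300.png?text=No+Image!"


class ArticleMetaParser(HTMLParser):
    """Record the first <meta content="..."> found inside an <article>."""

    def __init__(self) -> None:
        super().__init__()
        self.article_depth = 0  # greater than 0 while inside an <article>
        self.image_url: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.article_depth += 1
        elif tag == "meta" and self.article_depth and self.image_url is None:
            # attrs is a list of (name, value) pairs; a missing content
            # attribute leaves image_url as None so later tags are still tried.
            self.image_url = dict(attrs).get("content")

    def handle_endtag(self, tag):
        if tag == "article" and self.article_depth:
            self.article_depth -= 1


def get_image(page_source: str) -> str:
    """Return the image URL from the page source, or a placeholder URL."""
    parser = ArticleMetaParser()
    parser.feed(page_source)
    return parser.image_url or PLACEHOLDER


# Hypothetical markup, loosely shaped like the path used in the XPath above.
sample = (
    '<div id="root"><main><div><article>'
    '<meta itemprop="contentUrl" content="https://example.com/photo.jpeg">'
    "</article></div></main></div>"
)
print(get_image(sample))
print(get_image("<p>no article here</p>"))
```

Matching on "a meta tag inside an article" is looser than the absolute path, so it may survive small layout changes, but it could also pick up an unrelated meta tag if the page contains several.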
Thanks @monokal! I ended up doing it slightly differently. Here is the PR I submitted: https://github.com/toby-p/rightmove_webscraper.py/pull/43