rightmove_webscraper.py 2 Questions from Python / Panda Noob

Open haar-making opened this issue 3 years ago • 1 comments

Hi,

Apologies if the formatting is wrong. I'm new to GitHub but have tried to follow the guidelines.

Two potentially stupid questions from a Python noob who's also trying to get to grips with pandas.

Question 1: Getting the floorplan from the tree uses the following:

xp_floorplan_url = """//*[@id="floorplanTabs"]/div[2]/div[2]/img/@src"""
floorplan_url = tree.xpath(xp_floorplan_url)

However, "floorplanTabs" appears neither in the page source for a sample search (4 results, to keep it small) nor in the individual property page for one of those results.

When I inspect the page in Chrome I can't find "floorplanTabs" either.

Can you explain how this works?

Question 2: What does /div[2]/div[2] mean in the line below?

xp_floorplan_url = """//*[@id="floorplanTabs"]/div[2]/div[2]/img/@src"""

Many thanks for your help.

Sep 11 '22 20:09 haar-making

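I think Rightmove may have changed the structure of their website since the script was written, which would explain why "floorplanTabs" no longer appears in the page source. You can find the current XPath of the floorplan image element with your browser's developer tools (in Chrome, right-click the element in the Elements panel and choose Copy > Copy XPath). As for the expression: //*[@id="floorplanTabs"] matches the element whose id is "floorplanTabs" anywhere in the document, the first /div[2] selects that element's second div child, and the second /div[2] selects the second div child of that div. /img/@src then returns the src attribute of the img inside it. XPath positions are 1-based, so [2] means the second, not the third. Look up XPath expressions if you're still confused; they're easy to learn.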
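
To make the positional steps concrete, here is a minimal sketch run against made-up markup. The HTML snippet, the example.com URL, and the floorplan_urls helper are all invented for illustration and are not taken from Rightmove or from the script. It uses the standard library's xml.etree.ElementTree, which only supports a subset of XPath, so the src attribute is read with .get("src") rather than with /@src as in the lxml-style tree.xpath call above.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, invented for illustration; not Rightmove's real page.
SAMPLE = """
<html><body>
  <div id="floorplanTabs">
    <div>tab headers</div>
    <div>
      <div>first inner div</div>
      <div><img src="https://example.com/floorplan.png"/></div>
    </div>
  </div>
</body></html>
""".strip()

root = ET.fromstring(SAMPLE)


def floorplan_urls(tree, element_id="floorplanTabs"):
    """Return the src of each img found at <element_id>/div[2]/div[2]/img."""
    # div[2] means "the second div child" at each step (positions start at 1).
    path = f".//*[@id='{element_id}']/div[2]/div[2]/img"
    return [img.get("src") for img in tree.findall(path)]


urls = floorplan_urls(root)
missing = floorplan_urls(root, element_id="noSuchId")

print(urls)     # ['https://example.com/floorplan.png']
print(missing)  # []
```

The second call shows what happens when the id is not in the page at all: the query simply returns an empty list, which is consistent with the script finding nothing if the site no longer uses "floorplanTabs".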

Sep 16 '22 17:09 sm17977

Duplicate

Apr 04 '23 00:04 toby-p
