
cli quits prematurely

Open sefabey opened this issue 2 years ago • 8 comments

Hi, after a lot of trial and error, I finally got this working. It was not easy to get Firefox playing nicely.

After I got it running, I wanted to scrape the friends of someone who has just under 55 friends, all of which are publicly visible when you are logged in to FB.

This is the script I ran:

fbfriendlistscraper -e [email protected] -p passpapsspass -u username -o username.txt

The CLI print statements showed that there were 2 pages to scrape ("Scraping page 1 of 2"). The first page returned info from 24 users, wrote it to the txt file, and slept with total progress at 45%. The second page did the same and slept at 91% total progress. At that point, the txt file had 48 users. After the second sleep, I got the error below and the script quit with only 48 of the 55 users, so I'm not sure why I'm getting this error:

[+] Scrolling to the bottom
[+] Removing scraped elements from page
[+] Cleaning up leftover elements
Total progress ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━  91% -:--:--
Traceback (most recent call last):
  File "/home/datalab3/miniconda3/envs/fb_friend_scrape/bin/fbfriendlistscraper", line 8, in <module>
    sys.exit(main())
  File "/home/datalab3/miniconda3/envs/fb_friend_scrape/lib/python3.10/site-packages/fb_friend_list_scraper/scraper.py", line 363, in main
    do_scrape(driver, email, password, user_to_scrape, outfile_path, args)
  File "/home/datalab3/miniconda3/envs/fb_friend_scrape/lib/python3.10/site-packages/fb_friend_list_scraper/scraper.py", line 302, in do_scrape
    cleanup(driver, progress)
  File "/home/datalab3/miniconda3/envs/fb_friend_scrape/lib/python3.10/site-packages/fb_friend_list_scraper/scraper.py", line 224, in cleanup
    element = driver.find_element(
  File "/home/datalab3/miniconda3/envs/fb_friend_scrape/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 1248, in find_element
    return self.execute(Command.FIND_ELEMENT, {
  File "/home/datalab3/miniconda3/envs/fb_friend_scrape/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 425, in execute
    self.error_handler.check_response(response)
  File "/home/datalab3/miniconda3/envs/fb_friend_scrape/lib/python3.10/site-packages/selenium/webdriver/remote/errorhandler.py", line 247, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: /html/body/div[1]/div/div[4]/div/div[1]/div[1]/div[1]
Stacktrace:
WebDriverError@chrome://remote/content/shared/webdriver/Errors.jsm:183:5
NoSuchElementError@chrome://remote/content/shared/webdriver/Errors.jsm:395:5

sefabey avatar Jun 09 '22 22:06 sefabey

Thanks for opening an issue, I will look into this ASAP!

n0kovo avatar Jun 11 '22 05:06 n0kovo

To add, I had another go with a user who has roughly 850 friends. I got the error below at page 22 of 24. The total progress printout was 62%, yet I actually got URLs for 729 friends, so there is a potential mismatch (I was expecting roughly 85%).

  File "/home/datalab3/miniconda3/envs/fb_friend_scrape/bin/fbfriendlistscraper", line 8, in <module>
    sys.exit(main())
  File "/home/datalab3/miniconda3/envs/fb_friend_scrape/lib/python3.10/site-packages/fb_friend_list_scraper/scraper.py", line 363, in main
    do_scrape(driver, email, password, user_to_scrape, outfile_path, args)
  File "/home/datalab3/miniconda3/envs/fb_friend_scrape/lib/python3.10/site-packages/fb_friend_list_scraper/scraper.py", line 302, in do_scrape
    cleanup(driver, progress)
  File "/home/datalab3/miniconda3/envs/fb_friend_scrape/lib/python3.10/site-packages/fb_friend_list_scraper/scraper.py", line 224, in cleanup
    element = driver.find_element(
  File "/home/datalab3/miniconda3/envs/fb_friend_scrape/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 1248, in find_element
    return self.execute(Command.FIND_ELEMENT, {
  File "/home/datalab3/miniconda3/envs/fb_friend_scrape/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 425, in execute
    self.error_handler.check_response(response)
  File "/home/datalab3/miniconda3/envs/fb_friend_scrape/lib/python3.10/site-packages/selenium/webdriver/remote/errorhandler.py", line 247, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: /html/body/div[1]/div/div[4]/div/div[1]/div[1]/div[1]
Stacktrace:
WebDriverError@chrome://remote/content/shared/webdriver/Errors.jsm:183:5
NoSuchElementError@chrome://remote/content/shared/webdriver/Errors.jsm:395:5
element.find/</<@chrome://remote/content/marionette/element.js:300:16

sefabey avatar Jun 11 '22 08:06 sefabey

It seems Facebook might have changed some markup. I will try and get this fixed in the coming days.

n0kovo avatar Jun 11 '22 12:06 n0kovo

Hi, thanks for this great learning tool. I also get this premature exit: it scraped only around 25 friends in the first part of page 1 of 9, then threw the error below:

Traceback (most recent call last):
  File "C:\Users\my username\Desktop\fb_friend_list_scraper-develop - 2\fb_friend_list_scraper-develop\fb_friend_list_scraper\scraper.py", line 367, in <module>
    main()
  File "C:\Users\my username\Desktop\fb_friend_list_scraper-develop - 2\fb_friend_list_scraper-develop\fb_friend_list_scraper\scraper.py", line 363, in main
    do_scrape(driver, email, password, user_to_scrape, outfile_path, args)
  File "C:\Users\my username\Desktop\fb_friend_list_scraper-develop - 2\fb_friend_list_scraper-develop\fb_friend_list_scraper\scraper.py", line 302, in do_scrape
    cleanup(driver, progress)
  File "C:\Users\my username\Desktop\fb_friend_list_scraper-develop - 2\fb_friend_list_scraper-develop\fb_friend_list_scraper\scraper.py", line 224, in cleanup
    element = driver.find_element(
  File "C:\Users\my username\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 1251, in find_element
    return self.execute(Command.FIND_ELEMENT, {
  File "C:\Users\my username\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 430, in execute
    self.error_handler.check_response(response)
  File "C:\Users\my username\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 247, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: /html/body/div[1]/div/div[4]/div/div[1]/div[1]/div[1]
Stacktrace:
WebDriverError@chrome://remote/content/shared/webdriver/Errors.jsm:183:5
NoSuchElementError@chrome://remote/content/shared/webdriver/Errors.jsm:395:5
element.find/</<@chrome://remote/content/marionette/element.js:300:16

mrnoobnoobies avatar Jun 18 '22 08:06 mrnoobnoobies

Then, after several rounds of trial and error and reading many related Stack Overflow threads, I edited the part of the script that defines scroll_down as below:

def scroll_down(driver, progress):
    logprint("Scrolling to the bottom")
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    clicks = driver.find_elements(By.XPATH, '//*[@id="m_more_friends"]')
    for click in clicks:
        click = click.click()
        time.sleep(15)

I also edited the XPath under def cleanup as follows:

element = driver.find_element(
    By.XPATH,
    '//div[contains(concat(" ",normalize-space(@class)," ")," timeline ")]/div/div[contains(concat(" ",normalize-space(@class)," ")," _2pit ")]/div'
)
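For anyone puzzled by the `contains(concat(" ", normalize-space(@class), " "), " token ")` construction in that XPath: it is the standard XPath 1.0 idiom for matching one class token among several, since a plain `contains(@class, "timeline")` would also match classes like `timeline-header`. A pure-Python equivalent of the test it performs (illustration only, not code from the scraper):

```python
def has_class_token(class_attr, token):
    # Mirrors the XPath idiom contains(concat(" ", normalize-space(@class), " "), " token "):
    # collapse internal whitespace, pad both ends with a space, then look for
    # the space-padded token, so "timeline" matches but "timeline-header" does not.
    normalized = " " + " ".join(class_attr.split()) + " "
    return f" {token} " in normalized


print(has_class_token("  timeline   _2pit ", "timeline"))  # True
print(has_class_token("timeline-header", "timeline"))      # False
```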

But I still can't get the script to finish (it only got past page 3 of 9) before hitting the error below:

Traceback (most recent call last):
  File "C:\Users\my username\Desktop\fb_friend_list_scraper-develop - 1\fb_friend_list_scraper-develop\fb_friend_list_scraper\scraper.py", line 401, in <module>
    main()
  File "C:\Users\my username\Desktop\fb_friend_list_scraper-develop - 1\fb_friend_list_scraper-develop\fb_friend_list_scraper\scraper.py", line 397, in main
    do_scrape(driver, email, password, user_to_scrape, outfile_path, args)
  File "C:\Users\my username\Desktop\fb_friend_list_scraper-develop - 1\fb_friend_list_scraper-develop\fb_friend_list_scraper\scraper.py", line 323, in do_scrape
    scrape_profiles(driver, outfile_path, progress, args)
  File "C:\Users\my username\Desktop\fb_friend_list_scraper-develop - 1\fb_friend_list_scraper-develop\fb_friend_list_scraper\scraper.py", line 175, in scrape_profiles
    username = div.find("a")["href"][1:].replace("profile.php?id=", "")
  File "C:\Users\my username\AppData\Roaming\Python\Python310\site-packages\bs4\element.py", line 1519, in __getitem__
    return self.attrs[key]
KeyError: 'href'
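This second traceback is a different failure: the matched `<a>` tag simply has no `href` attribute at all (some anchors on the mobile page are JavaScript-only), and indexing a missing attribute with `["href"]` raises `KeyError`. A defensive variant of that extraction can skip such anchors instead of crashing. Here is a runnable sketch using only the standard library (a stand-in for BeautifulSoup; `extract_username` is a hypothetical helper mirroring line 175 of scraper.py):

```python
from html.parser import HTMLParser


# Collect every <a> tag's attributes; some anchors on the mobile page may
# lack an href entirely, which is what triggers the KeyError in scrape_profiles().
class AnchorCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.anchors = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.anchors.append(dict(attrs))


def extract_username(anchor_attrs):
    # dict.get() (or bs4's Tag.get()) returns None instead of raising
    # KeyError like ["href"] does, so href-less anchors can be skipped.
    href = anchor_attrs.get("href")
    if href is None:
        return None
    return href[1:].replace("profile.php?id=", "")


parser = AnchorCollector()
parser.feed('<a href="/profile.php?id=12345">Friend</a><a>no link</a>')
usernames = [u for u in map(extract_username, parser.anchors) if u is not None]
print(usernames)  # ['12345']
```

In the real script the equivalent change would be using `div.find("a").get("href")` and skipping the entry when it returns None.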

mrnoobnoobies avatar Jun 18 '22 08:06 mrnoobnoobies

Hi @mohdradhi84, thanks for your report! I haven't had the time to get a proper look at it yet. I'm considering making it clearer in the code exactly which elements the script is looking for on the page, to make it easier for people to debug when Facebook makes changes. I'll probably get around to it in the next couple of days. I'm pretty sure both of your errors stem from the page not looking like the script expects it to. I'll keep you posted!

n0kovo avatar Jun 18 '22 13:06 n0kovo

Nice, Sir, it will really be useful. Thank you so much for your great effort.


mrnoobnoobies avatar Jun 20 '22 05:06 mrnoobnoobies

Thanks, Sir @narkopolo. I made a small modification to your existing code and used mbasic.facebook.com instead of m.facebook.com.

Now it's working fine. Many thanks for your magnificent work, Sir 👍!
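The commenter doesn't share a diff, but switching frontends presumably amounts to changing the base domain the scraper navigates to, since mbasic.facebook.com serves much simpler, script-free markup than m.facebook.com. A hypothetical sketch of such a change (the variable and function names are illustrative, not from the project):

```python
# Hypothetical: if the scraper built its friend-list URL from one base
# domain, pointing it at the lighter mbasic frontend is a one-line change.
BASE_URL = "https://mbasic.facebook.com"  # was "https://m.facebook.com"


def friends_url(username):
    return f"{BASE_URL}/{username}/friends"


print(friends_url("someuser"))  # https://mbasic.facebook.com/someuser/friends
```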

mrnoobnoobies avatar Aug 03 '22 04:08 mrnoobnoobies