facebook-post-scraper

Can't log in because of cookies

Open • pynomaly opened this issue 4 years ago • 9 comments

When running the script, I get:

Traceback (most recent call last):
  File "scraper.py", line 357, in <module>
    postBigDict = extract(page=args.page, numOfPost=args.len, infinite_scroll=infinite, scrape_comment=scrape_comment)
  File "scraper.py", line 258, in extract
    _login(browser, EMAIL, PASSWORD)
  File "scraper.py", line 201, in _login
    browser.find_element_by_id('loginbutton').click()
  File "~/anaconda3/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 360, in find_element_by_id
    return self.find_element(by=By.ID, value=id_)
  File "~/anaconda3/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 978, in find_element
    'value': value})['value']
  File "~/anaconda3/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "~/anaconda3/lib/python3.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"[id="loginbutton"]"}
  (Session info: chrome=87.0.4280.88)

The browser shows the cookie consent window instead of the login form. Is there a solution?

pynomaly • Jan 04 '21 21:01

After replacing the login logic with the following code:

def _login(browser, email, password):
    browser.get("http://facebook.com")
    browser.maximize_window()
    browser.find_element_by_name("email").send_keys(email)
    browser.find_element_by_name("pass").send_keys(password)
    browser.find_element_by_id("u_0_h").click()
    browser.find_element_by_name("login").click()

I get this new error:

Traceback (most recent call last):
  File "scraper.py", line 405, in <module>
    scrape_comment=scrape_comment,
  File "scraper.py", line 279, in extract
    browser.get(page)
  File "/tmp/tmp.W3CmledTvJ/env/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 333, in get
    self.execute(Command.GET, {'url': url})
  File "/tmp/tmp.W3CmledTvJ/env/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "/tmp/tmp.W3CmledTvJ/env/lib/python3.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidArgumentException: Message: invalid argument
  (Session info: chrome=87.0.4280.66)

pynomaly • Jan 04 '21 21:01

I managed to make it log in like this:

def _login(browser, email, password):
    browser.get("http://facebook.com")
    browser.maximize_window()
    browser.find_element_by_id("u_0_h").click()
    time.sleep(3)
    browser.find_element_by_name("email").send_keys(email)
    browser.find_element_by_name("pass").send_keys(password)
    browser.find_element_by_name("login").click()
    time.sleep(5)

plknkl • Jan 19 '21 14:01

The cookie issue was solved for me by using a VPN with the US as the location, since the cookie prompt is not shown there. Not the most elegant solution, but it worked.

simon-gross • Feb 16 '21 14:02

Here is what I did:

Change the x_path_text_cookies and x_path_text_login values to match your language (mine are for Polish).

def _login(browser, email, password):
    browser.get("http://facebook.com")
    browser.maximize_window()
    browser.find_element_by_name("email").send_keys(email)
    browser.find_element_by_name("pass").send_keys(password)
    x_path_text_cookies = '//*[@title="Akceptuj wszystkie"]'
    x_path_text_login = '//*[@name="login"]'
    browser.find_element_by_xpath(x_path_text_cookies).click()
    browser.find_element_by_xpath(x_path_text_login).click()
    time.sleep(5)

SirCypkowskyy • Mar 31 '21 14:03

It should work now

SirCypkowskyy • Mar 31 '21 14:03

For me it worked after replacing _login with the following.

Note that "Consenti solo i cookie essenziali" should be changed to "Allow only essential cookies" for the English version.

import time

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

def _login(browser, email, password):
    browser.get("http://facebook.com")
    browser.maximize_window()
    browser.implicitly_wait(5)
    WebDriverWait(browser, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[contains(string(), 'Consenti solo i cookie essenziali')]"))).click()
    time.sleep(5)
    browser.find_element(By.NAME, "email").send_keys(email)
    browser.find_element(By.NAME, "pass").send_keys(password)
    browser.find_element(By.NAME, "login").click()
    time.sleep(5)
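
A minimal sketch of how the wait above could be made tolerant of several interface languages, and of regions where no cookie prompt appears (as reported earlier in this thread for a US location). The helper names cookie_button_xpath and dismiss_cookie_banner are hypothetical, and the English label is taken from the note above rather than verified against the live page.

```python
def cookie_button_xpath(labels):
    """Build one XPath matching a <button> whose text contains any of the labels."""
    if not labels:
        raise ValueError("at least one label is required")
    for label in labels:
        if "'" in label:
            # Labels are wrapped in single quotes below, so a quote would break the XPath.
            raise ValueError(f"label must not contain a single quote: {label!r}")
    conditions = " or ".join(f"contains(string(), '{label}')" for label in labels)
    return f"//button[{conditions}]"


def dismiss_cookie_banner(browser, labels, timeout=20):
    """Click the cookie button if it shows up; return False if it never appears."""
    # Imported here so cookie_button_xpath stays usable without Selenium installed.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.XPATH, cookie_button_xpath(labels))
    try:
        WebDriverWait(browser, timeout).until(EC.element_to_be_clickable(locator)).click()
        return True
    except TimeoutException:
        # No banner within the timeout (for example, a region without the prompt).
        return False
```

Inside _login this could be called right after browser.get(...), for example dismiss_cookie_banner(browser, ["Consenti solo i cookie essenziali", "Allow only essential cookies"]), before filling in the email and password fields.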

ferrazzipietro • Sep 16 '22 12:09

Sadly, the elegant solution by @ferrazzipietro does not seem to work.

DevTools listening on ws://127.0.0.1:50144/devtools/browser/248f4965-473a-42ee-a5e6-51dddec9dd2c
[24904:25920:1005/002847.731:ERROR:device_event_log_impl.cc(214)] [00:28:47.731] USB: usb_device_handle_win.cc:1048 Failed to read descriptor from node connection: A device attached to the system is not functioning. (0x1F)
[24904:25920:1005/002847.733:ERROR:device_event_log_impl.cc(214)] [00:28:47.733] USB: usb_device_handle_win.cc:1048 Failed to read descriptor from node connection: A device attached to the system is not functioning. (0x1F)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\mikha\Downloads\chromedriver_win32\scraper.py", line 258, in extract
    option.add_experimental_option("prefs", {
  File "C:\Users\mikha\Downloads\chromedriver_win32\scraper.py", line 200, in _login
    def _login(browser, email, password):
AttributeError: 'WebDriver' object has no attribute 'find_element_by_name'
>>>

mikhail-poda • Oct 04 '22 22:10

@mikhail-poda it seems you are still calling find_element_by_name(), which recent Selenium releases no longer provide on the WebDriver object. As far as I know, you should use find_element() and pass the locator strategy explicitly (for example By.NAME), as I did in the snippet I posted.
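
As a hedged illustration of the point above: the traceback in the first post shows find_element_by_id itself delegating to find_element(by=By.ID, ...), so the explicit form should work on both older and newer Selenium releases. The sketch below uses the plain string "name", which is the value of By.NAME, to stay free of imports; submit_login is a hypothetical helper name.

```python
# Same value as selenium.webdriver.common.by.By.NAME, spelled out so this
# sketch has no import-time dependency on Selenium.
BY_NAME = "name"


def submit_login(browser, email, password):
    """Fill in and submit the login form using the explicit find_element() API."""
    # Old style: browser.find_element_by_name("email")
    # New style: browser.find_element(By.NAME, "email")
    browser.find_element(BY_NAME, "email").send_keys(email)
    browser.find_element(BY_NAME, "pass").send_keys(password)
    browser.find_element(BY_NAME, "login").click()
```

With a real driver, submit_login(browser, EMAIL, PASSWORD) would replace the three find_element_by_name calls in the earlier _login variants.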

ferrazzipietro • Oct 05 '22 07:10

Thank you @ferrazzipietro, it was my mistake: I had to close the .py file in Notepad++ (saving it was not enough) so that the Python runtime picked up the new version of the file. After a successful login and opening the group, the Chrome window disappears with this message:

DevTools listening on ws://127.0.0.1:51236/devtools/browser/2f7e82af-6abc-4f01-8882-112db12f7ecc
[29572:8024:1005/205129.621:ERROR:device_event_log_impl.cc(214)] [20:51:29.621] USB: usb_device_handle_win.cc:1048 Failed to read descriptor from node connection: A device attached to the system is not functioning. (0x1F)
[29572:8024:1005/205129.622:ERROR:device_event_log_impl.cc(214)] [20:51:29.623] USB: usb_device_handle_win.cc:1048 Failed to read descriptor from node connection: A device attached to the system is not functioning. (0x1F)
[29572:28644:1005/205137.891:ERROR:registration_request.cc(266)] Registration response error message: PHONE_REGISTRATION_ERROR
[29572:28644:1005/205137.985:ERROR:mcs_client.cc(707)]   Error code: 500  Error message: Authentication Failed.
[29572:28644:1005/205137.985:ERROR:mcs_client.cc(709)] Failed to log in to GCM, resetting connection.
Number Of Scrolls Needed 2603

mikhail-poda • Oct 05 '22 18:10