WWWE
potential fix for google_search()
A suggested fix for google_search()
I tried the pull request and it doesn't seem to work for me. Every email I tested gets reported as not found in the Google search results, which is not the case!
The major change is that the function now performs a Google search using quotes, e.g. “[email protected]”, so it searches for that email exactly as typed. It works for me! Any public-facing email addresses return results, whilst private emails don’t. If that’s not exactly the intended function, my apologies.
The function does pretty much what you described, but when I test the following script using your pull request:
import os, sys
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

os.environ['MOZ_HEADLESS'] = '1'
cap = DesiredCapabilities().FIREFOX
cap["marionette"] = True

def google_search(email):
    endpoint = 'https://google.com/search?q=%22{}%22'.format(email)
    try:
        with webdriver.Firefox(capabilities=cap) as d:
            d.get(endpoint)
            if "No results found" or "did not match any documents" in d.page_source:
                return False
            else:
                return True
    except Exception as error:
        raise(error)

try:
    email = sys.argv[1]
    breached = google_search(email)
    if breached:
        print("{} shows up on google search results".format(email))
    else:
        print("{} doesn't show up on google search results.".format(email))
except IndexError:
    sys.exit(0)
I get positive results (by positive I mean not showing up in Google search results) for every email I test. When I use your method manually it works, but through that script it doesn't for some reason. I've tried it even with very simple emails that have been in thousands of breaches and it keeps reporting them as safe...
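Separately from any timing issue, the condition in the script may explain the "every email is not found" symptom on its own. Python parses `"No results found" or "did not match any documents" in d.page_source` as `"No results found" or (... in d.page_source)`, and a non-empty string is always truthy, so the branch that returns False is taken regardless of the page content. A minimal sketch of the difference follows; the helper names `original_check` and `reports_no_results` are hypothetical and not part of the pull request:

```python
# The two marker strings are taken from the script above.
NO_RESULT_MARKERS = ("No results found", "did not match any documents")


def original_check(page_source):
    # Mirrors the condition as written in the script. The left operand of
    # `or` is a non-empty string, so the expression is truthy for any input.
    return bool("No results found" or "did not match any documents" in page_source)


def reports_no_results(page_source):
    # Tests each marker against the page source separately.
    return any(marker in page_source for marker in NO_RESULT_MARKERS)
```

With a check along the lines of `reports_no_results`, `google_search()` would return False only when one of the two messages actually appears in the page.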
The patch ignores a race condition. Google's search results are rendered via JavaScript, and the script reads d.page_source right after d.get() returns without waiting for the DOM to be assembled.
cf. https://selenium-python.readthedocs.io/waits.html
Aha! Excellent catch. Thank you! I was stumped. I couldn’t recreate the issue on my end with my set of test emails. An explicit wait should resolve this issue.
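For reference, here is a rough sketch of what the explicit wait could look like, following the Selenium waits documentation linked above. It is not the code from the pull request. Several things in it are assumptions: that the results container has the element id `search`, the 10-second default timeout, the helper names `build_query_url` and `no_results_message_shown`, and returning None when the wait times out. It also uses `FirefoxOptions` for headless mode instead of the `MOZ_HEADLESS` variable and checks the two messages separately.

```python
from urllib.parse import quote

NO_RESULT_MARKERS = ("No results found", "did not match any documents")


def build_query_url(email):
    # Wrap the email in double quotes for an exact-match search and
    # percent-encode it so characters such as '@' are safe in the URL.
    return "https://google.com/search?q=" + quote('"{}"'.format(email))


def no_results_message_shown(page_source):
    return any(marker in page_source for marker in NO_RESULT_MARKERS)


def google_search(email, timeout=10):
    # Selenium is imported here so the helpers above work without it.
    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.FirefoxOptions()
    options.add_argument("-headless")
    driver = webdriver.Firefox(options=options)
    try:
        driver.get(build_query_url(email))
        try:
            # Explicit wait: poll until either the (assumed) results
            # container exists or a "no results" message is present.
            WebDriverWait(driver, timeout).until(
                lambda d: d.find_elements(By.ID, "search")
                or no_results_message_shown(d.page_source)
            )
        except TimeoutException:
            return None  # page never reached a recognisable state
        return not no_results_message_shown(driver.page_source)
    finally:
        driver.quit()
```

Returning None on a timeout keeps "could not tell" separate from "not found", so a slow or blocked page is not reported as a safe email.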