googlesearch
Infinite loop fetching when using `search` function
It appears the search function is broken: calls to it get stuck in an infinite loop, sending the same request over and over.
You can reproduce this easily with a simple script like this one:
from googlesearch import search
import logging
logging.basicConfig(level=logging.DEBUG)
print("Starting search...")
res = search("nhl bowen byram")
print("Finished search.")
list_of_urls = [x for x in res]
print(list_of_urls)
I also tried converting the generator directly to a list, with the same outcome:
from googlesearch import search
import logging
logging.basicConfig(level=logging.DEBUG)
print("Starting search...")
res = search("nhl bowen byram")
print(list(res))
print("Finished search.")
The output is the following:
Hello World
finished
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): www.google.com:443
DEBUG:urllib3.connectionpool:https://www.google.com:443 "GET /search?q=nhl%2Bbowen%2Bbyram&num=12&hl=en&start=0 HTTP/1.1" 200 None
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): www.google.com:443
DEBUG:urllib3.connectionpool:https://www.google.com:443 "GET /search?q=nhl%2Bbowen%2Bbyram&num=12&hl=en&start=0 HTTP/1.1" 200 None
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): www.google.com:443
DEBUG:urllib3.connectionpool:https://www.google.com:443 "GET /search?q=nhl%2Bbowen%2Bbyram&num=12&hl=en&start=0 HTTP/1.1" 200 None
To debug this further, I put trace statements in the package, and it looks like start and num_results are never updated:
# Fetch
start = 0
while start < num_results:
    print(start, num_results)
    # Send request
    resp = _req(escaped_term, num_results - start,
                lang, start, proxies, timeout)
Result:
01/30/2024 08:31:43 AM Results from Google: <generator object search at 0x122abcb30>
0 10
01/30/2024 08:31:43 AM Starting new HTTPS connection (1): www.google.com:443
01/30/2024 08:31:43 AM https://www.google.com:443 "GET /search?q=nhl%2Bbowen%2Bbyram&num=12&hl=en&start=0 HTTP/1.1" 200 None
0 10
01/30/2024 08:31:44 AM Starting new HTTPS connection (1): www.google.com:443
01/30/2024 08:31:44 AM https://www.google.com:443 "GET /search?q=nhl%2Bbowen%2Bbyram&num=12&hl=en&start=0 HTTP/1.1" 200 None
0 10
01/30/2024 08:31:45 AM Starting new HTTPS connection (1): www.google.com:443
01/30/2024 08:31:45 AM https://www.google.com:443 "GET /search?q=nhl%2Bbowen%2Bbyram&num=12&hl=en&start=0 HTTP/1.1" 200 None
0 10
01/30/2024 08:31:45 AM Starting new HTTPS connection (1): www.google.com:443
01/30/2024 08:31:45 AM https://www.google.com:443 "GET /search?q=nhl%2Bbowen%2Bbyram&num=12&hl=en&start=0 HTTP/1.1" 200 None
0 10
01/30/2024 08:31:45 AM Starting new HTTPS connection (1): www.google.com:443
01/30/2024 08:31:45 AM https://www.google.com:443 "GET /search?q=nhl%2Bbowen%2Bbyram&num=12&hl=en&start=0 HTTP/1.1" 200 None
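The trace above is consistent with a loop whose counter only advances when a result is actually parsed out of the response. Below is a minimal sketch of that pattern with a guard added, not the package's actual code: it assumes start is only incremented per yielded result, and the names paged_results, fetch_page, empty_page, and fake_pages are hypothetical stand-ins so the example runs without any network access.

```python
from typing import Callable, Iterator, List


def paged_results(fetch_page: Callable[[int, int], List[str]],
                  num_results: int = 10) -> Iterator[str]:
    """Yield up to num_results items, stopping if a page adds nothing."""
    start = 0
    while start < num_results:
        page = fetch_page(start, num_results - start)
        if not page:
            # Without this guard, start stays unchanged and the same
            # request is repeated forever.
            break
        for item in page:
            start += 1
            yield item
            if start >= num_results:
                break


def empty_page(start: int, count: int) -> List[str]:
    # Hypothetical stand-in: a response from which no results are parsed.
    return []


def fake_pages(start: int, count: int) -> List[str]:
    # Hypothetical stand-in: returns at most 4 results per request.
    return [f"https://example.com/{i}"
            for i in range(start, start + min(count, 4))]


print(list(paged_results(empty_page)))
print(len(list(paged_results(fake_pages, 10))))
```

With empty_page the generator ends immediately instead of looping, and with fake_pages it makes several requests until num_results items have been yielded. Whether an empty page should end the search silently or raise an error is a design choice left open here.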
I'm facing the same problem. Did you find a workaround?
Having the same problem