get_reviews error: google sorry (302 Moved)
Describe the bug Rate limiting or some other unknown behavior causes Google to flag the host IP address, which makes get_reviews fail. The specific error is:
Traceback (most recent call last):
File "/home/<redacted>/<redacted>/.venv/bin/ghunt", line 8, in <module>
sys.exit(main())
File "/home/<redacted>/<redacted>/.venv/lib/python3.10/site-packages/ghunt/ghunt.py", line 18, in main
parse_and_run()
File "/home/<redacted>/<redacted>/.venv/lib/python3.10/site-packages/ghunt/cli.py", line 55, in parse_and_run
process_args(args)
File "/home/<redacted>/<redacted>/.venv/lib/python3.10/site-packages/ghunt/cli.py", line 65, in process_args
asyncio.run(email.hunt(None, args.email_address, args.json))
File "/home/<redacted>/.pyenv/versions/3.10.14/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/home/<redacted>/.pyenv/versions/3.10.14/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "/home/<redacted>/<redacted>/.venv/lib/python3.10/site-packages/ghunt/modules/email.py", line 117, in hunt
err, stats, reviews, photos = await gmaps.get_reviews(as_client, target.personId)
File "/home/<redacted>/<redacted>/.venv/lib/python3.10/site-packages/ghunt/helpers/gmaps.py", line 57, in get_reviews
data = json.loads(req.text[5:])
File "/home/<redacted>/.pyenv/versions/3.10.14/lib/python3.10/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
File "/home/<redacted>/.pyenv/versions/3.10.14/lib/python3.10/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/home/<redacted>/.pyenv/versions/3.10.14/lib/python3.10/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
To Reproduce Since I do not know precisely which behavior caused Google to flag the host IP, I do not know exactly how to reproduce it. One could try creating a list of known-good Gmail accounts and then iterating through them rapidly to generate suspicious traffic.
Expected behavior The expected behavior is to fetch Google reviews and generate Maps data; instead, the error above abruptly terminates the program.
Screenshots Below is a print of req.text from helpers/gmaps.py -> get_reviews (line 56)
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="https://www.google.com/sorry/index?continue=https://www.google.com/locationhistory/preview/mas%3Fauthuser%3D0%26hl%3Den%26gl%3Dus%26pb%3D!1s109750315187539724244!2m3!1sYE3rYc2rEsqOlwSHx534DA!7e81!15i14416!6m2!4b1!7b1!9m0!16m4!1i100!4b1!5b1!6BQ0FFU0JrVm5TVWxEenc9PQ!17m28!1m6!1m2!1i0!2i0!2m2!1i458!2i736!1m6!1m2!1i1868!2i0!2m2!1i1918!2i736!1m6!1m2!1i0!2i0!2m2!1i1918!2i20!1m6!1m2!1i0!2i716!2m2!1i1918!2i736!18m12!1m3!1d806313.5865720833!2d150.19484835!3d-34.53825215!2m3!1f0!2f0!3f0!3m2!1i1918!2i736!4f13.1&hl=en&q=EgQjq5CYGNrv-LkGIi0qHInZSpJ9aDWuq3_PCAU4-rsGnw2QYFFKB-GSi3b1fxmkgqCu9srbeQwTKy8yAXJaAUM">here</A>.
</BODY></HTML>
System (please complete the following information):
- OS: Ubuntu
- Python version: 3.10.14
Recommendations
Near term -- add exception handling so the program fails gracefully when Google blocks the user's IP address
Long term -- identify the root cause and add logic (rate limiting, headers, or other measures) to avoid being blocked by Google
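To illustrate the near-term recommendation, here is a minimal sketch of guarding the JSON parse. It is not the fix that landed in GHunt; `GoogleBlockedError` and `parse_maps_response` are hypothetical names, and the two marker strings are simply taken from the HTML response shown above. The `[5:]` slice mirrors the `req.text[5:]` call in the traceback.

```python
import json


class GoogleBlockedError(Exception):
    """Hypothetical error: the response was not the expected JSON payload."""


def parse_maps_response(text: str):
    """Parse a reviews response body, or raise a readable error if it is a block page."""
    # Markers taken from the HTML body in this report (assumption: they are
    # representative of what Google returns when an IP is flagged).
    if "google.com/sorry" in text or "<TITLE>302 Moved</TITLE>" in text:
        raise GoogleBlockedError(
            "Google redirected the request to its 'sorry' page; "
            "the host IP is probably being rate limited."
        )
    try:
        # Same slice as in helpers/gmaps.py: skip the first 5 characters.
        return json.loads(text[5:])
    except json.JSONDecodeError as exc:
        raise GoogleBlockedError("Unexpected non-JSON response from Google.") from exc
```

A caller could catch `GoogleBlockedError`, print the message, and skip the reviews step instead of terminating with a traceback.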
Just confirming that this does appear to be a rate-limiting issue. After 24 hours, Google removed the block on the IP. Re-attempting to pull results for multiple users reproduced the error.
Looks like this error was already addressed in commit 2534f6cb3d2d119761b8803a6163e9c871a36d07 and just hasn't made its way into a release yet. I was going to create a pull request with the fix, but I see the fix is already in the master branch.
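For anyone scripting lookups over multiple accounts, a client-side throttle is one way to space requests out. The sketch below is only an illustration: `Throttle` is a hypothetical helper, not part of GHunt, and the interval that keeps Google from flagging an IP is unknown, so any value passed in is a guess.

```python
import time


class Throttle:
    """Hypothetical helper that enforces a minimum delay between calls to wait()."""

    def __init__(self, min_interval: float, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock      # injectable so the helper can be tested
        self._sleep = sleep
        self._last = None        # time of the previous call, if any

    def wait(self) -> None:
        """Block until at least min_interval seconds have passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` before each lookup in a loop would keep consecutive requests at least `min_interval` seconds apart. Whether that is enough to avoid the block is untested.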
Hi @kodamaChameleon any plans to work on your long-term recommendations ?
I did look into the problem. Because the issue is on Google's end and not GHunt's, your best bet for a long-term solution right now is something like a rotating proxy. I cannot speak on behalf of @mxrch, but given that this requires additional infrastructure and/or a subscription service, I wouldn't expect it to become a supported feature of GHunt.