search-engine-parser
TrafficError
Other search engines are working properly, but I'm only getting a TrafficError while using GoogleSearch().
I am hosting on Heroku.
Here are the error logs:
```
2021-04-12T17:34:41.357372+00:00 app[worker.1]: ENGINE FAILURE: Google
2021-04-12T17:34:41.357383+00:00 app[worker.1]:
2021-04-12T17:34:41.360353+00:00 app[worker.1]: The result parsing was unsuccessful. It is either your query could not be found or it was flagged as unusual traffic
2021-04-12T17:34:41.360354+00:00 app[worker.1]: Traceback (most recent call last):
2021-04-12T17:34:41.360355+00:00 app[worker.1]:   File "/app/.heroku/python/lib/python3.8/site-packages/pyrogram/dispatcher.py", line 217, in handler_worker
2021-04-12T17:34:41.360355+00:00 app[worker.1]:     await handler.callback(self.client, *args)
2021-04-12T17:34:41.360355+00:00 app[worker.1]:   File "/app/plugins/inline.py", line 595, in answer
2021-04-12T17:34:41.360356+00:00 app[worker.1]:     gresults = await g_search.async_search(gsearch, 1)
2021-04-12T17:34:41.360356+00:00 app[worker.1]:   File "/app/.heroku/python/lib/python3.8/site-packages/search_engine_parser/core/base.py", line 286, in async_search
2021-04-12T17:34:41.360357+00:00 app[worker.1]:     return self.get_results(soup, **kwargs)
2021-04-12T17:34:41.360357+00:00 app[worker.1]:   File "/app/.heroku/python/lib/python3.8/site-packages/search_engine_parser/core/base.py", line 235, in get_results
2021-04-12T17:34:41.360357+00:00 app[worker.1]:     raise NoResultsOrTrafficError(
2021-04-12T17:34:41.360360+00:00 app[worker.1]: search_engine_parser.core.exceptions.NoResultsOrTrafficError: The result parsing was unsuccessful. It is either your query could not be found or it was flagged as unusual traffic
```
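For anyone whose bot worker crashes on this, the failure can at least be contained until the parser is fixed. A minimal stdlib-only sketch of that pattern, with a stand-in search function in place of `GoogleSearch().async_search` (the real exception class is `search_engine_parser.core.exceptions.NoResultsOrTrafficError`, as the traceback shows; everything else here is illustration):

```python
import asyncio


# Stand-in for the library's exception; the real class lives at
# search_engine_parser.core.exceptions.NoResultsOrTrafficError.
class NoResultsOrTrafficError(Exception):
    pass


async def google_search(query: str, page: int) -> dict:
    # Stand-in for GoogleSearch().async_search(query, page); here it always
    # fails the way the Heroku logs above do.
    raise NoResultsOrTrafficError("flagged as unusual traffic")


async def search_with_fallback(query: str, page: int = 1) -> dict:
    """Return empty results instead of letting the exception kill the worker."""
    try:
        return await google_search(query, page)
    except NoResultsOrTrafficError:
        return {"titles": [], "links": [], "descriptions": []}


result = asyncio.run(search_with_fallback("python tutorials"))
print(result["titles"])  # []
```

The bot can then show a "try again later" message instead of dying in `handler_worker`.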
Hope you guys will fix this soon.
Is it possible that heroku IP's are blocked by Google @MeNsaaH ? Or could something else be at play here
@AbirHasan2005 can you confirm if this works locally on your machine but doesn't work on the Heroku?
@MeNsaaH It's not working on either my local machine or Heroku.
> Is it possible that heroku IP's are blocked by Google @MeNsaaH ? Or could something else be at play here

That's not possible. It was working properly a month ago.
Okay. Thank you for that info. It seems Google has updated their page. We'll need to update the parser for Google queries
I'll be looking into this
Thanks a lot sir. Waiting for that.

Me too.
Hello @AbirHasan2005. Could you confirm if the latest version fixes this? cc @MeNsaaH
@deven96 Which sir?
`search-engine-parser==0.6.2`?
Pip install directly from master, let's see if it works @AbirHasan2005:
pip install git+https://github.com/bisoncorps/search-engine-parser
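On Heroku the dependency has to come through `requirements.txt` rather than a manual `pip install`; a sketch of the equivalent line, assuming you want the unreleased master branch instead of the PyPI release:

```text
# requirements.txt: install search-engine-parser straight from master
git+https://github.com/bisoncorps/search-engine-parser
```

Appending `@<commit-sha>` to the URL pins an exact revision, which keeps Heroku builds reproducible once a working revision is found.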
Planning a way to make sure versions get the most updated scraping logic whenever page structure changes without having to push a new pypi version. Cc @AbirHasan2005
Tried, sir. But new errors are coming:
```
2021-05-13T16:13:03.816825+00:00 app[worker.1]: name 'proxy_user' is not defined
2021-05-13T16:13:03.816832+00:00 app[worker.1]: Traceback (most recent call last):
2021-05-13T16:13:03.816833+00:00 app[worker.1]:   File "/app/.heroku/python/lib/python3.9/site-packages/pyrogram/dispatcher.py", line 217, in handler_worker
2021-05-13T16:13:03.816834+00:00 app[worker.1]:     await handler.callback(self.client, *args)
2021-05-13T16:13:03.816835+00:00 app[worker.1]:   File "/app/plugins/inline.py", line 717, in answer
2021-05-13T16:13:03.816835+00:00 app[worker.1]:     gresults = await g_search.async_search(gsearch, 1)
2021-05-13T16:13:03.816836+00:00 app[worker.1]:   File "/app/.heroku/python/lib/python3.9/site-packages/search_engine_parser/core/base.py", line 307, in async_search
2021-05-13T16:13:03.816837+00:00 app[worker.1]:     soup = await self.get_soup(self.get_search_url(query, page, **kwargs), cache=cache, proxy=proxy, proxy_auth=(proxy_user, proxy_password))
2021-05-13T16:13:03.816838+00:00 app[worker.1]: NameError: name 'proxy_user' is not defined
```

(The same traceback repeats a second later.)
Do I have to change my code in the new version? Has any parameter changed?
Sorry, it seems the async search was faulty after the addition of proxy support. Could you try again @AbirHasan2005?
Sorry sir @deven96, the same `NoResultsOrTrafficError` is coming. Errors here.
Seems to work locally. Could you try to replicate locally and not on Heroku, so we can narrow it down?
Yes sir. It works well locally (tested on Windows 10). So why is it not working on Heroku?
`NoResultsOrTrafficError` typically means the structure of the received page is not fit for scraping, e.g. captcha pages. Should we insert some debug statements to view the HTML actually being retrieved? Therein lies our answer @AbirHasan2005.
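Those debug statements can be as small as dumping the raw response and checking it for block-page markers. A sketch of such a check; the marker strings are assumptions about what Google's interstitial pages typically contain, not anything taken from search-engine-parser itself:

```python
def looks_like_block_page(html: str) -> bool:
    """Heuristic check: does this HTML look like a Google block/captcha page?

    The marker strings are guesses about Google's interstitial pages,
    not anything defined by search-engine-parser.
    """
    lowered = html.lower()
    markers = ("unusual traffic", "captcha", "/sorry/")
    return any(marker in lowered for marker in markers)


print(looks_like_block_page("<html><h3>A normal result</h3></html>"))                   # False
print(looks_like_block_page("<html>Our systems have detected unusual traffic</html>"))  # True
```

Logging that boolean alongside the first few hundred characters of the response would immediately tell us whether Heroku's requests are being served a captcha page.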
It was working on Heroku before. I don't know what suddenly happened. Any suggestions from you to fix this?
Is it narrowed down to that particular app or any heroku app?
Same issue for all Heroku apps.
This is my friend's issue: #142
He is also getting `NoResultsOrTrafficError`.
Not only him. Everyone running on Heroku is getting the same issue.
Maybe this will help you
```
  File "/usr/local/lib/python3.9/site-packages/search_engine_parser/core/base.py", line 241, in get_results
    search_results = self.parse_result(results, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/search_engine_parser/core/base.py", line 150, in parse_result
    rdict = self.parse_single_result(each, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/search_engine_parser/core/engines/google.py", line 77, in parse_single_result
    title = link_tag.find('h3').text
AttributeError: 'NoneType' object has no attribute 'text'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/TeamUltroid/plugins/devtools.py", line 132, in _
    await aexec(cmd, event)
  File "/root/TeamUltroid/plugins/devtools.py", line 181, in aexec
    return await locals()["__aexec"](event, event.client)
  File "<string>", line 8, in __aexec
  File "/usr/local/lib/python3.9/site-packages/search_engine_parser/core/base.py", line 287, in async_search
    return self.get_results(soup, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/search_engine_parser/core/base.py", line 244, in get_results
    raise NoResultsOrTrafficError(
search_engine_parser.core.exceptions.NoResultsOrTrafficError: The returned results could not be parsed. This might be due to site updates or server errors. Drop an issue at https://github.com/bisoncorps/search-engine-parser if this persists
```
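That `AttributeError` is the actual symptom: `link_tag.find('h3')` returned `None` because Google's markup changed. A defensive version of that lookup, sketched against BeautifulSoup's documented API (this is illustration only, not the library's actual fix):

```python
from bs4 import BeautifulSoup


def extract_title(link_tag) -> str:
    """Return the <h3> text if present, or an empty string instead of crashing."""
    h3 = link_tag.find("h3")
    return h3.get_text(strip=True) if h3 is not None else ""


# An <a> tag with the expected <h3> title inside, and one without it.
with_h3 = BeautifulSoup('<a href="/x"><h3>Result title</h3></a>', "html.parser").a
without_h3 = BeautifulSoup('<a href="/x">bare link</a>', "html.parser").a
print(extract_title(with_h3))     # Result title
print(extract_title(without_h3))  # (empty string)
```

A guard like this would turn a hard crash into a missing title, but the real fix is updating the selectors to Google's new page structure.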
@MeNsaaH check
> @AbirHasan2005 can you confirm if this works locally on your machine but doesn't work on the Heroku?
Sir, in a Telegram userbot it's working fine, but in a Telegram bot it's showing this error (on Heroku). It sometimes shows the error (depending on the code):
```python
import re

# Older releases exposed GoogleSearch at the package root; on newer versions use:
# from search_engine_parser.core.engines.google import Search as GoogleSearch
from search_engine_parser import GoogleSearch


async def _(event):
    if event.fwd_from:
        return
    webevent = await event.reply("searching........")
    match = event.pattern_match.group(1)
    page = re.findall(r"page=\d+", match)
    try:
        page = page[0]
        page = page.replace("page=", "")
        # Strip the whole "page=N" token from the query
        # (the original used page[0], which breaks for multi-digit pages).
        match = match.replace("page=" + page, "")
    except IndexError:
        page = 1
    search_args = (str(match), int(page))
    gsearch = GoogleSearch()
    gresults = await gsearch.async_search(*search_args)
    msg = ""
    for i in range(len(gresults["links"])):
        try:
            title = gresults["titles"][i]
            link = gresults["links"][i]
            desc = gresults["descriptions"][i]
            msg += f"• [{title}]({link})\n**{desc}**\n\n"
        except IndexError:
            break
    await webevent.edit(
        "**Search Query:**\n`" + match + "`\n\n**Results:**\n" + msg, link_preview=False
    )
```
I used this code in the bot, and it showed this error.
> Not only him. Everyone getting same issue who running on Heroku.
Why is it working fine in a Telegram userbot, which is also deployed on Heroku?
It seems there's an issue with heroku blacklisting requests to google.
Yes. It's working on replit.com.