search-engine-parser

TrafficError

Open · AbirHasan2005 opened this issue 3 years ago • 31 comments

Other search engines are working properly, but I'm only getting TrafficError while using GoogleSearch()! I am hosting on Heroku.

Here are the error logs:

2021-04-12T17:34:41.357372+00:00 app[worker.1]: ENGINE FAILURE: Google
2021-04-12T17:34:41.357383+00:00 app[worker.1]: 
2021-04-12T17:34:41.360353+00:00 app[worker.1]: The result parsing was unsuccessful. It is either your query could not be found or it was flagged as unusual traffic
2021-04-12T17:34:41.360354+00:00 app[worker.1]: Traceback (most recent call last):
2021-04-12T17:34:41.360355+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.8/site-packages/pyrogram/dispatcher.py", line 217, in handler_worker
2021-04-12T17:34:41.360355+00:00 app[worker.1]: await handler.callback(self.client, *args)
2021-04-12T17:34:41.360355+00:00 app[worker.1]: File "/app/plugins/inline.py", line 595, in answer
2021-04-12T17:34:41.360356+00:00 app[worker.1]: gresults = await g_search.async_search(gsearch, 1)
2021-04-12T17:34:41.360356+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.8/site-packages/search_engine_parser/core/base.py", line 286, in async_search
2021-04-12T17:34:41.360357+00:00 app[worker.1]: return self.get_results(soup, **kwargs)
2021-04-12T17:34:41.360357+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.8/site-packages/search_engine_parser/core/base.py", line 235, in get_results
2021-04-12T17:34:41.360357+00:00 app[worker.1]: raise NoResultsOrTrafficError(
2021-04-12T17:34:41.360360+00:00 app[worker.1]: search_engine_parser.core.exceptions.NoResultsOrTrafficError: The result parsing was unsuccessful. It is either your query could not be found or it was flagged as unusual traffic

Hope you guys will fix this soon.

AbirHasan2005 avatar Apr 12 '21 17:04 AbirHasan2005
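
A minimal repro sketch outside the bot can help confirm whether the library itself is being blocked on a given host. The import path is assumed from the module layout visible in the traceback above, and indexing results["titles"] follows the usage shown later in this thread:

# Minimal repro sketch, assuming the engine class lives at
# search_engine_parser/core/engines/google.py as the traceback suggests.
import asyncio

from search_engine_parser.core.engines.google import Search as GoogleSearch

async def main():
    gsearch = GoogleSearch()
    # Same call pattern as in the traceback: query string plus page number.
    results = await gsearch.async_search("hello world", 1)
    for title, link in zip(results["titles"], results["links"]):
        print(title, link)

asyncio.run(main())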

Is it possible that Heroku IPs are blocked by Google @MeNsaaH? Or could something else be at play here?

deven96 avatar May 09 '21 09:05 deven96

@AbirHasan2005 can you confirm if this works locally on your machine but doesn't work on Heroku?

MeNsaaH avatar May 09 '21 09:05 MeNsaaH

@MeNsaaH It's not working on either my local machine or Heroku.

AbirHasan2005 avatar May 09 '21 12:05 AbirHasan2005

Is it possible that Heroku IPs are blocked by Google @MeNsaaH? Or could something else be at play here?

That's not possible.

AbirHasan2005 avatar May 09 '21 12:05 AbirHasan2005

1 month ago it was working properly.

AbirHasan2005 avatar May 09 '21 12:05 AbirHasan2005

Okay. Thank you for that info. It seems Google has updated their page. We'll need to update the parser for Google queries

MeNsaaH avatar May 09 '21 12:05 MeNsaaH

I'll be looking into this

MeNsaaH avatar May 09 '21 12:05 MeNsaaH

Okay. Thank you for that info. It seems Google has updated their page. We'll need to update the parser for Google queries

Thanks a lot sir.

AbirHasan2005 avatar May 09 '21 12:05 AbirHasan2005

Waiting for that 🙂

New-dev0 avatar May 13 '21 13:05 New-dev0

Me too 🤧

buddhhu avatar May 13 '21 13:05 buddhhu

Hello @AbirHasan2005. Could you confirm if the latest version fixes this? cc @MeNsaaH

deven96 avatar May 13 '21 14:05 deven96

Hello @AbirHasan2005. Could you confirm if the latest version fixes this? cc @MeNsaaH

@deven96 Which version, sir?

search-engine-parser==0.6.2?

AbirHasan2005 avatar May 13 '21 15:05 AbirHasan2005

Hello @AbirHasan2005. Could you confirm if the latest version fixes this? cc @MeNsaaH

@deven96 Which version, sir?

search-engine-parser==0.6.2?

Pip install directly from master; let's see if it works @AbirHasan2005

pip install git+https://github.com/bisoncorps/search-engine-parser

deven96 avatar May 13 '21 15:05 deven96
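
For a Heroku deployment, the same master install can be pinned in requirements.txt using standard pip VCS syntax (the #egg fragment is optional and assumed here):

# requirements.txt: install straight from the GitHub default branch instead of PyPI
git+https://github.com/bisoncorps/search-engine-parser#egg=search-engine-parser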

Planning a way to make sure versions get the most updated scraping logic whenever the page structure changes, without having to push a new PyPI version. Cc @AbirHasan2005

deven96 avatar May 13 '21 16:05 deven96
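
One possible shape of that idea, sketched purely for illustration: keep the CSS selectors as data that can be refreshed at runtime, so a page-structure change only needs a config update rather than a new PyPI release. Nothing below is from the library; the URL, file format and keys are made up.

# Illustrative only: fetch scraping selectors from a remote JSON file at runtime.
# The URL and keys ("result_block", "title") are hypothetical.
import requests

SELECTOR_URL = "https://example.com/search-engine-parser/selectors.json"

def load_selectors(engine_name):
    # Fall back to baked-in defaults if the remote config is unreachable or invalid.
    defaults = {"result_block": "div.g", "title": "h3"}
    try:
        response = requests.get(SELECTOR_URL, timeout=5)
        response.raise_for_status()
        return response.json().get(engine_name, defaults)
    except (requests.RequestException, ValueError):
        return defaults

selectors = load_selectors("google")
print(selectors["title"])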

Planning a way to make sure versions get the most updated scraping logic whenever the page structure changes, without having to push a new PyPI version. Cc @AbirHasan2005

Tried it, sir.

But new errors are coming:

2021-05-13T16:13:03.816825+00:00 app[worker.1]: name 'proxy_user' is not defined
2021-05-13T16:13:03.816832+00:00 app[worker.1]: Traceback (most recent call last):
2021-05-13T16:13:03.816833+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/pyrogram/dispatcher.py", line 217, in handler_worker
2021-05-13T16:13:03.816834+00:00 app[worker.1]: await handler.callback(self.client, *args)
2021-05-13T16:13:03.816835+00:00 app[worker.1]: File "/app/plugins/inline.py", line 717, in answer
2021-05-13T16:13:03.816835+00:00 app[worker.1]: gresults = await g_search.async_search(gsearch, 1)
2021-05-13T16:13:03.816836+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/search_engine_parser/core/base.py", line 307, in async_search
2021-05-13T16:13:03.816837+00:00 app[worker.1]: soup = await self.get_soup(self.get_search_url(query, page, **kwargs), cache=cache, proxy=proxy, proxy_auth=(proxy_user, proxy_password))
2021-05-13T16:13:03.816838+00:00 app[worker.1]: NameError: name 'proxy_user' is not defined
2021-05-13T16:13:04.739774+00:00 app[worker.1]: name 'proxy_user' is not defined
2021-05-13T16:13:04.739796+00:00 app[worker.1]: Traceback (most recent call last):
2021-05-13T16:13:04.739798+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/pyrogram/dispatcher.py", line 217, in handler_worker
2021-05-13T16:13:04.739798+00:00 app[worker.1]: await handler.callback(self.client, *args)
2021-05-13T16:13:04.739799+00:00 app[worker.1]: File "/app/plugins/inline.py", line 717, in answer
2021-05-13T16:13:04.739800+00:00 app[worker.1]: gresults = await g_search.async_search(gsearch, 1)
2021-05-13T16:13:04.739801+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/search_engine_parser/core/base.py", line 307, in async_search
2021-05-13T16:13:04.739802+00:00 app[worker.1]: soup = await self.get_soup(self.get_search_url(query, page, **kwargs), cache=cache, proxy=proxy, proxy_auth=(proxy_user, proxy_password))
2021-05-13T16:13:04.739803+00:00 app[worker.1]: NameError: name 'proxy_user' is not defined

Do I have to change my code for the new version? Has any parameter changed?

AbirHasan2005 avatar May 13 '21 16:05 AbirHasan2005

Sorry, it seems the async search was faulty after the addition of proxy support. Could you try again @AbirHasan2005?

deven96 avatar May 13 '21 16:05 deven96
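
For reference, the NameError above is the classic pattern of referencing a keyword that was never bound. A tiny illustrative example of the guard that avoids it (not the library's actual code; only the proxy_user/proxy_password names are taken from the traceback):

# Illustrative only: pull optional proxy credentials out of kwargs with defaults,
# so they are always bound instead of raising NameError when the caller omits them.
def search(query, **kwargs):
    proxy_user = kwargs.get("proxy_user")          # None when not supplied
    proxy_password = kwargs.get("proxy_password")
    proxy_auth = (proxy_user, proxy_password) if proxy_user and proxy_password else None
    return query, proxy_auth

print(search("hello world"))                                      # ('hello world', None)
print(search("hello world", proxy_user="u", proxy_password="p"))  # ('hello world', ('u', 'p'))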

Sorry, it seems the async search was faulty after the addition of proxy support. Could you try again @AbirHasan2005?

Sorry sir @deven96, the same NoResultsOrTrafficError is coming.

Errors Here.

AbirHasan2005 avatar May 13 '21 17:05 AbirHasan2005

Sorry, it seems the async search was faulty after the addition of proxy support. Could you try again @AbirHasan2005?

Sorry sir @deven96, the same NoResultsOrTrafficError is coming.

Errors Here.

Seems to work locally; could you try to replicate locally and not on Heroku so we can narrow it down?

deven96 avatar May 13 '21 18:05 deven96

Sorry, it seems the async search was faulty after the addition of proxy support. Could you try again @AbirHasan2005?

Sorry sir @deven96, the same NoResultsOrTrafficError is coming. Errors Here.

Seems to work locally; could you try to replicate locally and not on Heroku so we can narrow it down?

Yes sir. It works well locally; tested on Windows 10.

So why is it not working on Heroku?

AbirHasan2005 avatar May 14 '21 07:05 AbirHasan2005

NoResultsOrTrafficError typically means the structure of the page received is not fit for scraping, e.g. captcha pages. Should we insert some debug statements to view the HTML actually being retrieved? Therein lies our answer @AbirHasan2005

deven96 avatar May 14 '21 19:05 deven96
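
A minimal sketch of that kind of debug check, assuming aiohttp is available (the library fetches pages asynchronously); the query, URL and User-Agent below are illustrative. On a blocked host, the printed snippet should show a captcha or "unusual traffic" page instead of normal result markup:

# Debug sketch: fetch a Google results page directly and print the start of the
# HTML to see whether this host is being served a captcha/consent page.
import asyncio

import aiohttp

HEADERS = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36"}

async def dump_google_html(query="hello world"):
    url = "https://www.google.com/search"
    async with aiohttp.ClientSession(headers=HEADERS) as session:
        async with session.get(url, params={"q": query}) as resp:
            html = await resp.text()
            print("status:", resp.status)
            print(html[:1000])  # enough to spot "unusual traffic" or a captcha form

asyncio.run(dump_google_html())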

It was working on Heroku before. I don't know what suddenly happened. Do you have any suggestions to fix this?

AbirHasan2005 avatar May 14 '21 19:05 AbirHasan2005

Is it narrowed down to that particular app or any Heroku app?

deven96 avatar May 14 '21 19:05 deven96

Is it narrowed down to that particular app or any Heroku app?

Same issue for all Heroku apps.

This is my friend's issue: #142

He is also getting NoResultsOrTrafficError ...

AbirHasan2005 avatar May 14 '21 20:05 AbirHasan2005

Not only him. Everyone running on Heroku is getting the same issue.

AbirHasan2005 avatar May 14 '21 20:05 AbirHasan2005

Maybe this will help you

  File "/usr/local/lib/python3.9/site-packages/search_engine_parser/core/base.py", line 241, in get_results
    search_results = self.parse_result(results, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/search_engine_parser/core/base.py", line 150, in parse_result
    rdict = self.parse_single_result(each, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/search_engine_parser/core/engines/google.py", line 77, in parse_single_result
    title = link_tag.find('h3').text
AttributeError: 'NoneType' object has no attribute 'text'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/TeamUltroid/plugins/devtools.py", line 132, in _
    await aexec(cmd, event)
  File "/root/TeamUltroid/plugins/devtools.py", line 181, in aexec
    return await locals()["__aexec"](event, event.client)
  File "<string>", line 8, in __aexec
  File "/usr/local/lib/python3.9/site-packages/search_engine_parser/core/base.py", line 287, in async_search
    return self.get_results(soup, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/search_engine_parser/core/base.py", line 244, in get_results
    raise NoResultsOrTrafficError(
search_engine_parser.core.exceptions.NoResultsOrTrafficError: The returned results could not be parsed. This might be due to site updates or server errors. Drop an issue at https://github.com/bisoncorps/search-engine-parser if this persists

buddhhu avatar Jun 27 '21 11:06 buddhhu
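
The traceback shows link_tag.find('h3') returning None for at least one result block, meaning Google's markup no longer matches the selectors. A defensive parsing sketch along those lines (illustrative only, not the library's actual google.py code; the div.g selector is an assumption):

# Illustrative defensive parsing with BeautifulSoup: skip result blocks that
# do not contain the expected <a>/<h3> tags instead of crashing on .text.
from bs4 import BeautifulSoup

html = """
<div class="g"><a href="https://example.com"><h3>Example result</h3></a></div>
<div class="g"><a href="https://no-title.example"></a></div>
"""

soup = BeautifulSoup(html, "html.parser")
for block in soup.select("div.g"):          # "div.g" is an assumed selector
    link_tag = block.find("a")
    title_tag = link_tag.find("h3") if link_tag else None
    if not (link_tag and title_tag):
        continue                            # markup changed or ad block: skip it
    print(title_tag.get_text(strip=True), link_tag.get("href"))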

Maybe this will help you

search_engine_parser.core.exceptions.NoResultsOrTrafficError: The returned results could not be parsed. This might be due to site updates or server errors. Drop an issue at https://github.com/bisoncorps/search-engine-parser if this persists

@MeNsaaH check

buddhhu avatar Jul 02 '21 14:07 buddhhu

@AbirHasan2005 can you confirm if this works locally on your machine but doesn't work on Heroku?

Sir, in a Telegram userbot it's working fine, but in a Telegram bot on Heroku it shows this error. It sometimes shows that error, depending on the code.

# Imports added for completeness: re for the page parsing below, and GoogleSearch
# from search_engine_parser (import path assumed from the library layout shown in
# the tracebacks above).
import re

from search_engine_parser.core.engines.google import Search as GoogleSearch


async def _(event):
    if event.fwd_from:
        return
    webevent = await event.reply("searching........")
    match = event.pattern_match.group(1)
    page = re.findall(r"page=\d+", match)
    try:
        page = page[0]
        page = page.replace("page=", "")
        # strip the whole "page=N" token from the query, not just its first digit
        match = match.replace("page=" + page, "")
    except IndexError:
        page = 1
    search_args = (str(match), int(page))
    gsearch = GoogleSearch()
    gresults = await gsearch.async_search(*search_args)
    msg = ""
    for i in range(len(gresults["links"])):
        try:
            title = gresults["titles"][i]
            link = gresults["links"][i]
            desc = gresults["descriptions"][i]
            msg += f"❍[{title}]({link})\n**{desc}**\n\n"
        except IndexError:
            break
    await webevent.edit(
        "**Search Query:**\n`" + match + "`\n\n**Results:**\n" + msg, link_preview=False
    )

I used this code in the bot, and it showed this error.

itzzzyashu avatar Jan 03 '22 02:01 itzzzyashu
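
One way to keep a bot handler from crashing on this is to catch the exception around the search call. The exception import path is the one shown in the tracebacks above; everything else is an illustrative wrapper, not code from the thread:

# Illustrative: wrap the search call so a blocked/empty result produces a message
# instead of an unhandled exception inside the handler.
import asyncio

from search_engine_parser.core.engines.google import Search as GoogleSearch
from search_engine_parser.core.exceptions import NoResultsOrTrafficError

async def safe_google_search(query, page=1):
    gsearch = GoogleSearch()
    try:
        return await gsearch.async_search(query, page)
    except NoResultsOrTrafficError:
        # Google served a captcha/"unusual traffic" page or no parsable results.
        return None

async def main():
    results = await safe_google_search("hello world")
    if results is None:
        print("Search is currently blocked from this host.")
    else:
        print(results["titles"][:3])

asyncio.run(main())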

Not only him. Everyone running on Heroku is getting the same issue.

Why is it working fine in a Telegram userbot (which is also deployed on Heroku)?

itzzzyashu avatar Jan 03 '22 02:01 itzzzyashu

It seems there's an issue with requests to Google from Heroku being blacklisted.

MeNsaaH avatar Jan 04 '22 13:01 MeNsaaH

It seems there's an issue with requests to Google from Heroku being blacklisted.

Yes. It's working on replit.com 🙃

AbirHasan2005 avatar Jan 04 '22 16:01 AbirHasan2005
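
If Google is indeed rejecting Heroku's shared IP ranges, routing the request through an external proxy is the usual workaround. The proxy, proxy_user and proxy_password keyword names below come from the earlier traceback; the exact signature is an assumption, and the proxy endpoint and credentials are placeholders:

# Illustrative: pass an external proxy to the search call so requests do not
# originate from Heroku's shared IPs. Keyword names taken from the traceback;
# the proxy endpoint and credentials are placeholders.
import asyncio

from search_engine_parser.core.engines.google import Search as GoogleSearch

async def main():
    gsearch = GoogleSearch()
    results = await gsearch.async_search(
        "hello world",
        1,
        proxy="http://proxy.example.com:8080",
        proxy_user="username",
        proxy_password="password",
    )
    print(results["titles"][:3])

asyncio.run(main())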