AskQuora

No search results after a while

Open ritiek opened this issue 7 years ago • 5 comments

After searching for questions for a while, the script stops returning results for any search term. This happens when there is little delay between searches, which causes DuckDuckGo to temporarily block requests coming from the script.

Looking for a fix.
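
One possible direction, sketched here under the assumption that spacing requests out is enough to avoid the temporary block (this has not been verified against DuckDuckGo): enforce a minimum interval between consecutive searches. The `Throttle` class and the interval value are hypothetical and are not part of AskQuora.

```python
import time


class Throttle:
    """Enforce a minimum interval between calls (hypothetical helper)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests (assumed value)
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` right before each search request would keep rapid consecutive searches from reaching the search engine back to back.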

ritiek avatar Apr 06 '17 20:04 ritiek

Randomly selecting a header from a given list increases the number of requests that can be made before it gets temporarily blocked.
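
A minimal sketch of the idea, assuming the header being rotated is `User-Agent` (the comment does not say which header is meant). The strings in `USER_AGENTS` and the `random_headers` helper are illustrative and are not taken from the AskQuora source.

```python
import random

# Illustrative User-Agent strings; the actual list used by the script is not shown in this thread.
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:53.0) Gecko/20100101 Firefox/53.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_4) AppleWebKit/603.1.30 (KHTML, like Gecko) Version/10.1 Safari/603.1.30",
]


def random_headers(user_agents=USER_AGENTS):
    """Return request headers with a randomly chosen User-Agent."""
    return {"User-Agent": random.choice(user_agents)}
```

The returned dict can be passed as the headers of each outgoing request, so that consecutive requests do not all present the same header.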

ritiek avatar May 08 '17 14:05 ritiek

What you are fundamentally doing is a brute-force search with random headers. This is not efficient.

Your goal is to search for answers on Quora. There are two alternatives I would suggest:

  • Either use the Quora API for search, if it exists, OR;
  • Google 'scrapy'. What you want to do is basically:
    a. Visit the search box on Quora.
    b. Enter the keywords.
    c. Look at the search result URL pattern: https://www.quora.com/search?q=how+do+i+make+chicken+curry
    d. Now that you know the URL pattern, look at the pattern in the results page.
    e. Learn some of the scrapy framework and use it as an extractor to fetch and follow links from the results page only, with a depth limit of 1 and no more, and fetch the resulting HTML elements. Scraping uses HTML selector patterns.
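
Step c comes down to URL-encoding the keywords into the `q` query parameter. A small sketch using only the standard library; the `quora_search_url` helper name is hypothetical, and the base URL is the one shown in the example link above.

```python
from urllib.parse import urlencode

QUORA_SEARCH = "https://www.quora.com/search"


def quora_search_url(keywords):
    """Build a search URL following the pattern in the example link."""
    # urlencode uses quote_plus by default, so spaces become '+'.
    return QUORA_SEARCH + "?" + urlencode({"q": keywords})
```

For the keywords `how do i make chicken curry`, this reproduces the example URL from step c.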

This way you would be able to run concurrent search queries (say 16) against Quora with a given browser header (if you care about that) for multiple searches. You could then even build up a set of cached queries using an in-memory database like Redis (although this would be a small-scale project) and set up a UI that acts like ytinstant.com/ for Quora.
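
A rough sketch of the concurrency and caching idea, with assumptions stated up front: a thread pool stands in for scrapy's own concurrency, a plain dict stands in for Redis, and `fetch` is a hypothetical caller-supplied function that takes a query string and returns its results. None of these names come from AskQuora.

```python
from concurrent.futures import ThreadPoolExecutor


def search_many(queries, fetch, cache=None, max_workers=16):
    """Run uncached queries concurrently and return {query: result}.

    `fetch` is a caller-supplied function (hypothetical) that performs one search.
    `cache` is any dict-like store; a plain dict stands in for Redis here.
    """
    cache = {} if cache is None else cache
    # De-duplicate while keeping order, and skip anything already cached.
    missing = [q for q in dict.fromkeys(queries) if q not in cache]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for query, result in zip(missing, pool.map(fetch, missing)):
            cache[query] = result
    return {q: cache[q] for q in queries}
```

Passing the same `cache` object to later calls means repeated queries are answered without another request, which is the part an instant-search UI would lean on.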

codecakes avatar May 14 '17 20:05 codecakes

@codecakes Quora API does not exist.

https://www.quora.com/search?q=how+do+i+make+chicken+curry

If I use this (or any other search) link without logging in, it redirects me to the login page and tells me to log in to see the search results.

So, the only way I can think of is scraping search engines. :confused:

ritiek avatar May 15 '17 04:05 ritiek

> So, the only way I can think of is scraping search engines. :confused:

Have you tried the second option yet that I proposed?

codecakes avatar Jul 31 '17 17:07 codecakes

Let me go through your code again and suggest a PR soon. @ritiek

codecakes avatar Jul 31 '17 17:07 codecakes