whoogle-search
[FEATURE] Multiple Proxies
Is it possible to add multiple proxies and select the lowest-latency proxy in real time?
@benbusby I would like to give this a try. However, I would need guidance on it 😅. Are there maybe some resources I could refer to?
@benbusby I will take it up
Thanks @vacom13, and sorry I missed your message from back in December! I think for an initial implementation you could just add support for multiple proxies, and retry requests using a different proxy (if multiple are configured) if one times out. Selecting the least latency proxy can be added later as an improvement.
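For the later improvement mentioned above (picking the lowest-latency proxy), a rough sketch could look like the following. It is only an illustration, not Whoogle code: `rank_proxies_by_latency` and the `probe` callable are hypothetical names, and `probe` is assumed to send one test request through a proxy and raise on timeout or failure.

```python
import time
from typing import Callable, Iterable, List, Tuple


def rank_proxies_by_latency(
    proxies: Iterable[str],
    probe: Callable[[str], None],
    clock: Callable[[], float] = time.monotonic,
) -> List[Tuple[str, float]]:
    """Return (proxy, seconds) pairs sorted fastest first.

    `probe` is a hypothetical callable: it should send one test request
    through the given proxy and raise an exception on timeout or failure.
    Proxies whose probe raises are left out of the result.
    """
    timings = []
    for proxy in proxies:
        start = clock()
        try:
            probe(proxy)
        except Exception:
            continue  # unreachable or timed out: skip this proxy
        timings.append((proxy, clock() - start))
    # Sort by measured duration, lowest latency first
    return sorted(timings, key=lambda pair: pair[1])
```

The first entry of the returned list would be the proxy to prefer, and the remaining entries give a natural fallback order if that one later times out.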
@benbusby no problem. And yes I will look into it
@benbusby Hey. I have actually been really busy with family and college work. I did think of a way to implement this, I suppose. Correct me if I am wrong. For now I should probably just get a list of working proxies from a site and give the user a checkbox in the config to select whether they want to use the proxies, right? And then I need to make sure that the search results come back within the given timeframe. After that I can give the user the ability to select the latency, right?
@benbusby there is this Python library, free-proxies. It basically scrapes for proxies on https://www.sslproxies.org/ and gives a string for working proxies. I could use that, and in case it times out, I could try to get another proxy. As it gets the proxy in real time, I suppose there shouldn't be a problem, since according to the documentation it checks whether the proxy is working.
Or I could just scrape the list myself for the first 5 proxies and then check those out?
Hey @vacom13, I think that's actually a good idea, but potentially a bit out of scope for this issue. I think all this issue should really support is multiple user-specified proxies. So if a user has access to multiple, they can specify them as a comma-separated string (or something along those lines).
So currently WHOOGLE_PROXY_LOC only accepts one IP:PORT string, but could be updated to have multiple and cycle through them if one of them returns an error response code. If a user just specified the proxy locations as a comma separated string such as IP:4000,IP:4001, then in the request module we could do something like:
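import os

# Filter out blanks: ''.split(',') yields [''], which would be truthy
raw = os.environ.get('WHOOGLE_PROXY_LOC', '')
proxy_paths = [p.strip() for p in raw.split(',') if p.strip()]
if proxy_paths:
    # ...
    for path in proxy_paths:
        # Validate a 200 response and no captcha from search URL
        pass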
I think your idea could be an entirely separate issue, but I'd like to personally look into the free-proxies library a bit first.
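To make the cycling idea a bit more concrete, here is a minimal sketch of parsing a comma-separated proxy string and falling back to the next proxy when one fails. This is not the actual Whoogle request module: `parse_proxy_list`, `looks_blocked`, `fetch_with_failover`, and the `fetch(url, proxy)` callable are all hypothetical, and the captcha check (a substring match on the page body) is an assumption about how a blocked response might be detected.

```python
from typing import Callable, List, Optional, Tuple


def parse_proxy_list(raw: str) -> List[str]:
    """Split a comma-separated 'IP:PORT,IP:PORT' string, dropping blanks."""
    return [item.strip() for item in raw.split(",") if item.strip()]


def looks_blocked(status: int, body: str) -> bool:
    """Assumed check: a non-200 status or a captcha marker in the page."""
    return status != 200 or "captcha" in body.lower()


def fetch_with_failover(
    url: str,
    proxies: List[str],
    fetch: Callable[[str, str], Tuple[int, str]],
) -> Optional[Tuple[str, str]]:
    """Try each proxy in order; return (proxy, body) for the first good one.

    `fetch(url, proxy)` is a hypothetical callable that returns
    (status_code, body) and raises on connection errors or timeouts.
    Returns None if every proxy fails or looks blocked.
    """
    for proxy in proxies:
        try:
            status, body = fetch(url, proxy)
        except Exception:
            continue  # connection error or timeout: try the next proxy
        if not looks_blocked(status, body):
            return proxy, body
    return None
```

In practice the list would come from something like `parse_proxy_list(os.environ.get('WHOOGLE_PROXY_LOC', ''))`, and `fetch` could wrap an HTTP client call that takes a proxy and a timeout. Passing `fetch` in as a parameter keeps the failover logic testable without real proxies.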
@benbusby I have tried to work on this, but not having multiple working proxies available makes it confusing to test. I used a couple of free proxies but ran into internal errors, maybe because the connection keeps timing out. The free proxies did work in a test script I created, but whenever I add a proxy to the Whoogle env and then run it, it always leads to some problem. I also ran into a rate limiting issue with one proxy. Anyway, I will keep at it.