requests-ip-rotator
aiohttp-ip-rotator
Hey, I am back again :) I was happily parsing the last few days away and already blew past the free tier, but I wanted to scale my operation up further to make it even faster. I am limited to a maximum of 60 workers in the pool, so I decided to rewrite it from multiprocessing to asynchronous concurrency. That's when I realized I can't use the Requests module, and your module was made to work only with Requests.
How difficult would it be to rewrite it to be compatible with aiohttp? Is there any way to make it work with Requests, even though the module is inherently blocking at the socket level? The higher the latency, the more beneficial it would be to move this work from multiprocessing to an asynchronous loop. I could try to work on it myself, but I am new to Python, so I'd appreciate any feedback or advice you can give me.
I just found Dask and am looking into whether it could help me keep using Requests. Another possibility is to rent a server with enough virtual cores to go beyond 60 workers.
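One common workaround for keeping Requests inside an event loop is to push each blocking call onto a thread with asyncio.to_thread, capping concurrency with a semaphore instead of a process pool. A minimal sketch under that assumption (the fetch function below is a hypothetical stand-in for a real session.get call, so this runs without network access):

```python
import asyncio

# Hypothetical stand-in for a blocking Requests call such as session.get(url).
def fetch(url: str) -> str:
    # In real code this would block on network I/O.
    return f"200 {url}"

async def crawl(urls, limit=100):
    # Cap the number of in-flight calls with a semaphore instead of a worker pool.
    sem = asyncio.Semaphore(limit)

    async def bounded(url):
        async with sem:
            # Run the blocking call in a worker thread so the event loop stays free.
            return await asyncio.to_thread(fetch, url)

    # gather preserves input order in its results.
    return await asyncio.gather(*(bounded(u) for u in urls))

urls = [f"https://example.com/{i}" for i in range(250)]
results = asyncio.run(crawl(urls))
print(len(results))  # 250
```

Note that asyncio.to_thread still runs each call in a thread pool under the hood, so this only relaxes the worker limit rather than giving true socket-level async; for that, aiohttp is still the right tool.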
Hi! If you go to the PR tab, the latest PR shows a version which supports this. I'm looking to complete the changes and merge it in early next month, if you'd rather wait, but if not then I'd recommend cloning the PR'd fork and using the example given in the thread. Hope this helps! :)
Great stuff! I've already installed the fork and am now going through the class Harry wrote. I'm slightly lost, since I'm also new to every other module in the project, but I think I'll eventually get it, emphasis on eventually. A usage example of the new class would be much appreciated and would explain what everything does at a surface level.
Currently I'm stuck on where labeled_urls came from; it's really hard when there are no variable or return type annotations.
Hi! I was just about to start writing something similar but then found this project. Great job! How is your new aiohttp project going? Is a release close?
Hey @ZOV-code, still very much a WIP and nothing publicly available for 2-4 months I'm afraid!
@Ge0rg3 thank you for the information
Hi! Is there any update on this? I am trying to run my code asynchronously but I don't see a way to do it with this approach. Thank you!
Hi @jherrerogb98, I probably won't get around to implementing this for another couple of months at the very least. However, multithreading via threading, or other parallel programming such as multiprocessing, should work fine. I hope this helps 😄
Okay, thank you!
Hello! I also needed a similar asynchronous library, so I made an aiohttp implementation: aiohttp-ip-rotator. You can find it here: https://github.com/D4rkwat3r/aiohttp-ip-rotator
Hi all, closing this issue as the aiohttp code will not be merged into this project. If you are set on aiohttp, then the aiohttp-ip-rotator lib is probably a good fit.
Depending on your use case, aiohttp may be faster than threading requests.
However, you can also run async requests with this lib via:
import requests as rq
import concurrent.futures
from requests_ip_rotator import ApiGateway

site = "https://bbc.co.uk"

# Create and start the gateway, sized for the thread pool below
gateway = ApiGateway(site)
gateway.pool_connections = 30
gateway.pool_maxsize = 30
gateway.start()

# Mount the gateway as the transport adapter for the target site
session = rq.Session()
session.mount(site, gateway)

with concurrent.futures.ThreadPoolExecutor(max_workers=25) as executor:
    futures_map = {}
    # Trigger 100 requests
    for i in range(100):
        url = site + "/" + str(i)
        future = executor.submit(session.get, url)
        futures_map[future] = url
    # Collect results
    for future in concurrent.futures.as_completed(futures_map):
        # Check for error
        error = future.exception()
        url = futures_map[future]
        if error:
            print(f"Error for {url}: {error}")
            continue
        # Get response
        response = future.result()
        print(f"{url} - {response.status_code}")

# Delete the created API gateways when finished
gateway.shutdown()