update_proxy not in codebase
I was trying to update the proxies, but it looks like that function does not exist in the codebase. The only mention is in the docs.
Problem
I would like to get the markdown of 500+ websites in parallel, each with a different proxy. How can I do that? Right now I can run batches of 10, with every request in a batch using the same proxy set via the browser config, but this is not ideal because I would like each request to go through a different proxy/IP.
https://github.com/unclecode/crawl4ai/blob/f9c601eb7e9007baa068df8d4a21de2da9ae58f0/docs/md_v2/tutorial/tutorial.md?plain=1#L237-L251
Hi, if you click on the search results to view more, you'll find that you need to build your own function to return the proxy. Meanwhile, you can check it out here.
get_next_proxy function: This function is responsible for providing the next proxy in the rotation. You can implement your own proxy selection logic inside it.
For example:
import random

async def get_next_proxy():
    # List of available proxies
    proxies = [
        "http://proxy1.com:8080",
        "http://proxy2.com:8080",
        "http://proxy3.com:8080",
    ]
    # Randomly choose one proxy
    next_proxy = random.choice(proxies)
    return {"server": next_proxy}
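For the original question (many URLs in parallel, each through its own proxy), one possible approach is to pair every URL with a proxy up front and cap concurrency with a semaphore. The sketch below is only an illustration under stated assumptions: fetch_markdown is a hypothetical placeholder, not a crawl4ai function, and PROXIES, crawl_all, and max_concurrency are names invented here. Replace the placeholder body with the crawl call and proxy setting that your installed crawl4ai version actually supports.

```python
import asyncio
import itertools

# Hypothetical proxy pool; replace with real proxy URLs.
PROXIES = [
    "http://proxy1.com:8080",
    "http://proxy2.com:8080",
    "http://proxy3.com:8080",
]


async def fetch_markdown(url: str, proxy: dict) -> str:
    """Hypothetical placeholder for the real crawl call.

    Swap this body for the crawl4ai call you use, passing `proxy`
    through whatever configuration your installed version supports.
    """
    await asyncio.sleep(0)  # stands in for network I/O
    return f"# {url} via {proxy['server']}"


async def crawl_all(urls, proxies=PROXIES, max_concurrency=10):
    """Crawl all URLs concurrently, pairing URLs with proxies round-robin."""
    semaphore = asyncio.Semaphore(max_concurrency)
    proxy_cycle = itertools.cycle(proxies)

    async def worker(url, proxy):
        # The semaphore limits how many requests are in flight at once.
        async with semaphore:
            return url, await fetch_markdown(url, proxy)

    # Proxies are assigned up front so the URL-to-proxy pairing is deterministic.
    tasks = [worker(url, {"server": next(proxy_cycle)}) for url in urls]
    return dict(await asyncio.gather(*tasks))
```

Usage would look like results = asyncio.run(crawl_all(urls)), which returns a dict mapping each URL to its markdown. With round-robin assignment, proxies are reused once the pool is exhausted, so a strictly different proxy per request needs a pool at least as large as the URL list.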
@b-sai I apologize for the inconvenience. If you refer to the recently updated documentation, you will notice that some old pages (including the tutorial pages) have been removed because they contained mistakes. Some of these mistakes originated from the language model I used to help write them and were not properly checked. I have therefore generated a new set of documentation and reviewed each document individually. For working with proxies, please refer to the new documentation: https://docs.crawl4ai.com/advanced/proxy-security/
@Umpire2018 I appreciate that you are helping with multiple issues and answering questions. If you are interested, please let me know and share your email address. I will send you an invitation to our Discord channels and help you become one of our collaborators.