update_proxy not in codebase

Open b-sai opened this issue 1 year ago • 1 comments

I was trying to update the proxies, but it looks like that function does not exist in the codebase. The only mention is in the docs.

Problem

I would like to get the markdown of 500+ websites in parallel, each with a different proxy. How can I do that? Right now I can run in batches of 10, using the same proxy for each batch (set via the browser config), but this is not ideal, as I would like each request to be made with a different proxy/IP.

b-sai avatar Jan 07 '25 22:01 b-sai

https://github.com/unclecode/crawl4ai/blob/f9c601eb7e9007baa068df8d4a21de2da9ae58f0/docs/md_v2/tutorial/tutorial.md?plain=1#L237-L251

Hi, if you click on the search results to view more, you'll find that you need to write your own function that returns the proxy. In the meantime, you can check it out here.

  • get_next_proxy function:
    • The function is responsible for providing the next proxy in the rotation. You can implement your own proxy selection logic inside this function.

For example:

import random

async def get_next_proxy():
    # List of available proxies
    proxies = [
        "http://proxy1.com:8080",
        "http://proxy2.com:8080",
        "http://proxy3.com:8080",
    ]

    # Randomly choose one proxy
    next_proxy = random.choice(proxies)

    return {"server": next_proxy}
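
To run many URLs in parallel with a different proxy per request, one option is to assign proxies round-robin and cap concurrency with a semaphore. The sketch below is only an illustration of that pattern, not crawl4ai's API: `fetch_markdown`, `crawl_all`, and `max_concurrency` are hypothetical names, and `fetch_markdown` is a placeholder that you would replace with your actual crawl call configured with the given proxy.

```python
import asyncio
from itertools import cycle

# Placeholder proxies, same as in the example above.
PROXIES = [
    "http://proxy1.com:8080",
    "http://proxy2.com:8080",
    "http://proxy3.com:8080",
]


async def fetch_markdown(url: str, proxy: dict) -> str:
    """Hypothetical placeholder: swap in the real crawl call that uses `proxy`."""
    await asyncio.sleep(0)  # stands in for network I/O
    return f"{url} via {proxy['server']}"


async def crawl_all(urls, proxies, max_concurrency=10):
    """Pair each URL with the next proxy in rotation and crawl concurrently."""
    semaphore = asyncio.Semaphore(max_concurrency)
    rotation = cycle(proxies)

    async def worker(url, proxy):
        # The semaphore limits how many requests are in flight at once.
        async with semaphore:
            return await fetch_markdown(url, proxy)

    # Proxies are assigned up front, so URL i gets proxies[i % len(proxies)].
    tasks = [worker(url, {"server": next(rotation)}) for url in urls]
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    urls = [f"https://example.com/page/{i}" for i in range(5)]
    for line in asyncio.run(crawl_all(urls, PROXIES, max_concurrency=2)):
        print(line)
```

With fewer proxies than URLs, the rotation reuses proxies; for every request to leave from a distinct IP, the proxy pool needs to be at least as large as the URL list.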

Umpire2018 avatar Jan 12 '25 08:01 Umpire2018

@b-sai I apologize for the inconvenience. If you check the recently updated documentation, you will notice that some old pages (including the tutorial pages) have been removed because they contained mistakes. Some of these mistakes originated from the language model I used to help write them and did not undergo proper checks. I therefore generated a new set of documentation and reviewed each document individually. For working with proxies, please refer to the new documentation: https://docs.crawl4ai.com/advanced/proxy-security/

unclecode avatar Jan 13 '25 12:01 unclecode

@Umpire2018 I appreciate that you are helping with multiple issues and answering questions. If you are interested, please let me know and share your email address. I will send you an invitation to our Discord channels and help you become one of our collaborators.

unclecode avatar Jan 13 '25 12:01 unclecode