AutoCrawler

Google, Naver multiprocess image web crawler (Selenium)

Results 16 AutoCrawler issues

Detected OS : Windows Detected OS : Windows Error occurred while initializing chromedriver - HTTPSConnectionPool(host='chromedriver.storage.googleapis.com', port=443): Max retries exceeded with url: /103.0.5060/chromedriver_win32.zip (Caused by ProxyError('Cannot connect to proxy.', OSError(0, 'Error')))...

Hello, I am crawling using keywords in various languages, and your repo has been a tremendous help! The code worked perfectly until last week and I could get 100s ~...

Excuse me, I saw that your code has been updated. I ran `python main.py --full true` again, but it now produces thumbnails instead of full images. ![naver_0003](https://user-images.githubusercontent.com/43515926/137501615-16b354b1-a9f5-45a2-b95e-65f7a1bbe312.jpg) ![naver_0004](https://user-images.githubusercontent.com/43515926/137501688-7d2bbc73-80a2-4cba-a1ee-9d8fa95e6dd0.jpg) ![naver_0006](https://user-images.githubusercontent.com/43515926/137501708-1313b85f-6007-466f-bae9-271b48638563.jpg)

`(Minseok) ubuntu@DESKTOP-SMIU2JP:~/anaconda3/envs/Minseok/Portfolio/TeamProject/AutoCrawler-master$ Xvfb :99 -ac & DISPLAY=:99 python3 main.py [1] 18306 (EE) Fatal server error: (EE) Server is already active for display 99 If this server is no longer running,...

Download failed - check_hostname requires server_hostname Downloading cat from google: 1 / 700 Download failed - check_hostname requires server_hostname Downloading cat from google: 1 / 700 Download failed - check_hostname...

When running the script, it tries to scroll down the page once, but after that the error keeps occurring and I can't get any links. My...

Hello, I can download thumbnails using `python main.py`, but I can’t download with the parameter `python main.py --full true`. Please help me, thanks! ![image](https://user-images.githubusercontent.com/43515926/104871970-8151a280-5987-11eb-9ca1-ff918f076478.png)

This project is awesome. But how can I download only a limited number of images, since only the images on the first page are the most important?
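
One possible approach to the question above is to truncate the list of collected links before downloading. This is only a sketch and is not AutoCrawler's own code or an existing option: `take_first` is a hypothetical helper, and the `example.com` URLs stand in for whatever links the crawler has collected.

```python
from itertools import islice


def take_first(links, limit):
    """Return at most `limit` links, keeping their original order.

    Hypothetical helper for illustration; not part of AutoCrawler.
    """
    if limit < 0:
        raise ValueError("limit must be non-negative")
    # islice stops after `limit` items, so it also works on generators.
    return list(islice(links, limit))


# Placeholder links standing in for the crawler's collected results.
links = [f"https://example.com/img_{i}.jpg" for i in range(700)]
first_page = take_first(links, 50)
print(len(first_page))
```

Because results are usually collected in page order, keeping the first N links should roughly correspond to the first page, though that ordering is an assumption here.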

I set `full_resolution=True`, but the resolution of the images is still very small. How can I obtain high-quality images?

Hello, I used the options `--full true --no_gui true/false`, but most of the downloaded images are low-resolution, and most paths look like "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9....". ![image](https://user-images.githubusercontent.com/73953274/189272608-0504acc8-8171-45b5-833c-366fbe34ce53.png)
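
To gauge how many collected links are thumbnails rather than full images, one could check each URL's host. The sketch below is not part of AutoCrawler: `is_google_thumbnail` and `split_links` are hypothetical helpers, and the code assumes that thumbnail links always use an `encrypted-tbnN.gstatic.com` host like the one quoted above.

```python
import re
from urllib.parse import urlparse

# Assumption: Google thumbnail links use hosts such as
# encrypted-tbn0.gstatic.com, as in the path quoted in this issue.
_THUMB_HOST = re.compile(r"^encrypted-tbn\d+\.gstatic\.com$")


def is_google_thumbnail(url: str) -> bool:
    """Return True if the URL's host matches the assumed thumbnail pattern."""
    host = urlparse(url).hostname or ""
    return bool(_THUMB_HOST.match(host))


def split_links(links):
    """Split links into (full_size_candidates, thumbnails), keeping order."""
    full = [u for u in links if not is_google_thumbnail(u)]
    thumbs = [u for u in links if is_google_thumbnail(u)]
    return full, thumbs
```

A high share of links in the second list would suggest that the crawler fell back to thumbnail sources instead of resolving the original image URLs.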