
Issue with proxy file

Open lorenzoromani1983 opened this issue 6 years ago • 5 comments

Hi. I am not able to run the proxy file. I have it formatted this way:

Socks4 182.48.90.81:1080
Socks4 36.37.225.50:33012

It is a simple txt file with many rows. I get this error message:

Invalid proxy file. Should have the following format: {}'.format(parse_proxy_file.__doc__))
Exception: Invalid proxy file. Should have the following format: Parses a proxy file
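For illustration, here is a hypothetical parser for a one-proxy-per-line file in the form `protocol host:port`. This is a sketch of the kind of check GoogleScraper's `parse_proxy_file` performs, not its actual implementation (the real function may expect extra fields such as username/password); the point is that two proxies crammed onto one line fail this kind of validation.

```python
def parse_proxy_lines(text):
    """Parse proxy entries, one per line, in the form "protocol host:port".

    Hypothetical sketch of the validation GoogleScraper applies; a line
    holding more than one proxy (or no colon) is rejected.
    """
    proxies = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split()
        if len(parts) != 2 or ':' not in parts[1]:
            raise ValueError('Invalid proxy line: {!r}'.format(line))
        proto, hostport = parts
        host, port = hostport.rsplit(':', 1)
        proxies.append((proto.lower(), host, int(port)))
    return proxies
```

With one proxy per line this returns a list of `(protocol, host, port)` tuples; with both proxies on a single line, as in the file quoted above, it raises immediately.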

please, help :)

lorenzoromani1983 avatar Apr 25 '18 21:04 lorenzoromani1983

This project hasn't been updated for more than a year. Prefer using this project, developed from this one: https://github.com/fassn/SerpScrap (this is just a fork; I didn't start the project)

fassn avatar Apr 25 '18 23:04 fassn

Thanks, but as far as I understand, that needs to be used within code. This is a "stand-alone" tool, which is easier for me (not a coder, not an expert at all). Did you manage to make the proxies work in HTTP mode? I need to get a JSON/CSV of 700/800 keywords from Google.

lorenzoromani1983 avatar Apr 26 '18 21:04 lorenzoromani1983

Yes, I use proxies on the SerpScrap project without problems.

fassn avatar Apr 27 '18 00:04 fassn

Thanks, it looks neat. However, I ran into two main problems:

1) I can't save to CSV: how should I proceed? This is the error I get:

Traceback (most recent call last):
  File "C:\Users\Lorenzo\Anaconda\lib\site-packages\serpscrap\csv_writer.py", line 10, in write
    with open(file_name, 'w', encoding='utf-8', newline='') as f:
PermissionError: [Errno 13] Permission denied: 'c:/.csv'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Lorenzo\Desktop\scraper.txt", line 17, in <module>
    results = scrap.as_csv('c:/')
  File "C:\Users\Lorenzo\Anaconda\lib\site-packages\serpscrap\serpscrap.py", line 134, in as_csv
    writer.write(file_path + '.csv', self.results)
  File "C:\Users\Lorenzo\Anaconda\lib\site-packages\serpscrap\csv_writer.py", line 17, in write
    raise Exception
Exception
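A side note on this traceback: it shows `as_csv` appending `'.csv'` to whatever path is passed, so `scrap.as_csv('c:/')` tries to create `c:/.csv` at the drive root, which Windows refuses. A small hypothetical helper can catch this before the write; the directory name below is made up for illustration.

```python
import os

def safe_csv_base(base_path):
    """Validate a base path before handing it to something that appends
    '.csv' (as serpscrap's as_csv does, per the traceback above).

    Rejects bare directory paths like 'c:/' and missing parent dirs.
    """
    directory, name = os.path.split(base_path)
    if not name:
        raise ValueError(
            'Pass a file base name, not a bare directory: '
            '{!r} would become {!r}'.format(base_path, base_path + '.csv'))
    if directory and not os.path.isdir(directory):
        raise ValueError('Directory does not exist: {!r}'.format(directory))
    return base_path
```

So instead of `scrap.as_csv('c:/')`, something like `scrap.as_csv('c:/Users/Lorenzo/results')` (a writable location) should produce `results.csv`.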

2) When I try to scrape many keywords with proxies, I get this error and end up banned by Google:

2018-04-28 14:00:13,048 - scrapcore.scraper.selenium - WARNING - 'NoneType' object has no attribute 'group'

I am using proxies from free online proxy lists. Maybe they are already blacklisted by Google?

lorenzoromani1983 avatar Apr 28 '18 08:04 lorenzoromani1983

Hi,

  • for question 1: can you please open an issue in the SerpScrap project (https://github.com/ecoron/SerpScrap/issues)?
  • for question 2: yes, most free online proxies are already banned or blocked by Google

ecoron avatar May 08 '18 15:05 ecoron