GoogleScraper
AttributeError: 'ScraperSearch' object has no attribute 'query'
After re-installing GoogleScraper, this time in a virtualenv, the error message has changed, and it now seems to be related to GoogleScraper itself.
I ran the following script:
    #!/usr/bin/python3.4
    # -*- coding: utf-8 -*-

    # https://github.com/NikolaiT/GoogleScraper
    # Shows how to control GoogleScraper programmatically

    import sys
    import GoogleScraper
    from GoogleScraper import scrape_with_config, GoogleSearchError
    from GoogleScraper.database import ScraperSearch, SERP, Link

    # EXAMPLES OF HOW TO USE GoogleScraper

    # very basic usage
    def basic_usage():
        # See in the config.cfg file for possible values
        config = {
            'SCRAPING': {
                'use_own_ip': 'True',
                'keyword': 'Let\'s go bubbles!',
                'search_engines': 'yandex',
                'num_pages_for_keyword': 1
            },
            'SELENIUM': {
                'sel_browser': 'chrome',
            },
            'GLOBAL': {
                'do_caching': 'False'
            }
        }

        try:
            sqlalchemy_session = scrape_with_config(config)
        except GoogleSearchError as e:
            print(e)

        # let's inspect what we got
        for search in sqlalchemy_session.query(ScraperSearch).all():
            for serp in search.serps:
                print(serp)
                for link in serp.links:
                    print(link)

    # simulating a image search for all search engines that support image search
    # then download all found images :)

    # MAIN FUNCTION
    if __name__ == '__main__':
        usage = 'Usage: {} [basic|image-search]'.format(sys.argv[0])
        if len(sys.argv) != 2:
            print(usage)
        else:
            arg = sys.argv[1]
            if arg == 'basic':
                basic_usage()
            elif arg == 'image':
                image_search()
            else:
                print(usage)
and the output is:
    time python3 google_scraper_example.py 'basic'
    2015-02-02 15:43:24,708 - GoogleScraper - INFO - Going to scrape 1 keywords with 1 proxies by using 1 threads.
    2015-02-02 15:43:24,709 - GoogleScraper - INFO - [+] SelScrape[localhost][search-type:normal][http://yandex.ru/yandsearch?] using search engine "yandex". Num keywords=1, num pages for keyword=[1]
    2015-02-02 15:44:25,763 - GoogleScraper - ERROR - Message: unknown error: Chrome failed to start: exited abnormally
      (Driver info: chromedriver=2.13.307649 (bf55b442bb6b5c923249dd7870d6a107678bfbb6),platform=Linux 3.13.0-32-generic x86_64)

    Exception in thread [yandex]SelScrape:
    Traceback (most recent call last):
      File "/usr/lib/python3.4/threading.py", line 920, in _bootstrap_inner
        self.run()
      File "/home/marco/crawlscrape/env/lib/python3.4/site-packages/GoogleScraper/selenium_mode.py", line 494, in run
        raise_or_log('{}: Aborting due to no available selenium webdriver.'.format(self.name), exception_obj=SeleniumMisconfigurationError)
      File "/home/marco/crawlscrape/env/lib/python3.4/site-packages/GoogleScraper/log.py", line 30, in raise_or_log
        raise exception_obj(msg)
    GoogleScraper.scraping.SeleniumMisconfigurationError: [yandex]SelScrape: Aborting due to no available selenium webdriver.

    Traceback (most recent call last):
      File "google_scraper_example.py", line 63, in <module>
        basic_usage()
      File "google_scraper_example.py", line 41, in basic_usage
        for search in sqlalchemy_session.query(ScraperSearch).all():
    AttributeError: 'ScraperSearch' object has no attribute 'query'
Any suggestions, Nikolai?
Remove this code from the usage file:
    print(serp)
This version has some bugs: the serp has no query attribute if it is read from a cache file.
I'm not sure I understood your kind suggestion. I commented out (deactivated) the following lines:
    for search in sqlalchemy_session.query(ScraperSearch).all():
        for serp in search.serps:
            print(serp)
            for link in serp.links:
                print(link)
    time python3 google_scraper_example.py 'basic'
    2015-02-03 10:07:32,754 - GoogleScraper - INFO - Going to scrape 1 keywords with 1 proxies by using 1 threads.
    2015-02-03 10:07:32,754 - GoogleScraper - INFO - [+] SelScrape[localhost][search-type:normal][http://yandex.ru/yandsearch?] using search engine "yandex". Num keywords=1, num pages for keyword=1
    2015-02-03 10:08:33,843 - GoogleScraper - ERROR - Message: unknown error: Chrome failed to start: exited abnormally
      (Driver info: chromedriver=2.13.307649 (bf55b442bb6b5c923249dd7870d6a107678bfbb6),platform=Linux 3.13.0-32-generic x86_64)

    Exception in thread Thread-2:
    Traceback (most recent call last):
      File "/usr/lib/python3.4/threading.py", line 920, in _bootstrap_inner
        self.run()
      File "/usr/local/lib/python3.4/dist-packages/GoogleScraper/selenium_mode.py", line 419, in run
        raise SeleniumMisconfigurationError('Aborting due to no available selenium webdriver.')
    GoogleScraper.scraping.SeleniumMisconfigurationError: Aborting due to no available selenium webdriver.
I can't help with that part. I use Windows 7, not Linux. You should search for problems with running chromedriver on Linux.
Maybe your chromedriver version is not supported.
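Before digging into GoogleScraper itself, it may help to check the local Selenium setup, since the log above reports "Chrome failed to start: exited abnormally". The sketch below is only a diagnostic aid, not part of GoogleScraper: the function name `check_selenium_chrome_env` and the list of candidate browser executable names are assumptions made for illustration. It uses only the standard library to report whether chromedriver and a Chrome binary are on the PATH and, on Linux, whether DISPLAY is set (a missing display is one common reason for Chrome failing to start on a server, though it may not be the cause here).

```python
import os
import shutil
import sys


def check_selenium_chrome_env(
        driver_name="chromedriver",
        browser_names=("google-chrome", "chromium-browser", "chromium", "chrome")):
    """Collect basic facts about the local Chrome/chromedriver setup.

    The executable names are guesses; adjust them for your system.
    """
    browser_path = None
    for name in browser_names:
        browser_path = shutil.which(name)
        if browser_path:
            break

    if sys.platform.startswith("linux"):
        # Chrome normally needs an X display unless it is run headless.
        display = os.environ.get("DISPLAY")
    else:
        display = "n/a"

    return {
        "chromedriver_path": shutil.which(driver_name),
        "browser_path": browser_path,
        "display": display,
    }


if __name__ == "__main__":
    for key, value in check_selenium_chrome_env().items():
        print("{}: {}".format(key, value or "NOT FOUND"))
```

If any entry prints NOT FOUND, that is a reasonable place to start looking before suspecting the scraper code.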
The same problem on Win 8.1 x64
Try modifying the last line of GoogleScraper/core.py:
    if return_results: return scraper_search
should be
    if return_results: return session
Hope this helps.
Hey, I am getting this error:
Traceback (most recent call last):
File "test.py", line 137, in
Hey, I got the answer with this: try modifying the last line of GoogleScraper/core.py:
    if return_results: return scraper_search
should be
    if return_results: return session
Hi, I have this issue when I try to run a script. Do you know why?
'scrape_method': 'http',
'search_engine_name': 'google',
'status': 'successful'}
2016-02-09 10:43:07,836 - GoogleScraper.caching - INFO - 2 cache files found in .scrapecache/
2016-02-09 10:43:07,837 - GoogleScraper.caching - INFO - 1/1 objects have been read from the cache. 0 remain to get scraped.
Traceback (most recent call last):
File "test.py", line 142, in
Hey mavverick, modify the GoogleScraper/core.py file: change "return scraper_search" to "return session".
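For anyone who would rather not patch the installed package, here is a minimal sketch of a workaround on the caller's side. It assumes, as the thread suggests, that `scrape_with_config` returns either a session (which has `query`) or a single `ScraperSearch` object (which has `serps`), depending on the installed version. The helper names `iter_searches` and `print_results` are hypothetical and not part of GoogleScraper.

```python
def iter_searches(result, model):
    """Yield search objects from whatever scrape_with_config returned.

    result: a session-like object (has .query) or a search-like
            object (has .serps); None yields nothing.
    model:  the mapped class to query for, e.g. ScraperSearch.
    """
    if result is None:
        return
    if hasattr(result, "query"):
        # Session-like: behave like the original example script.
        for search in result.query(model).all():
            yield search
    elif hasattr(result, "serps"):
        # Already a single search object: use it directly.
        yield result
    else:
        raise TypeError("Unexpected result type: {!r}".format(type(result)))


def print_results(result, model, out=print):
    """Print every SERP and link, as the example script intended."""
    for search in iter_searches(result, model):
        for serp in search.serps:
            out(serp)
            for link in serp.links:
                out(link)
```

In the example script this would replace the inspection loop with something like `print_results(scrape_with_config(config), ScraperSearch)`. Checking attributes rather than types keeps the helper independent of which return value a given GoogleScraper version uses.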