Scrape metadata with the built-in Flickr crawler

Open spencerchubb opened this issue 5 years ago • 0 comments

from icrawler import ImageDownloader


class MyImageDownloader(ImageDownloader):

    def __init__(self, thread_num, signal, session, storage, log_file):
        super(MyImageDownloader, self).__init__(thread_num, signal, session,
                                                storage)
        # Log file that receives one line of metadata per downloaded image.
        self.log_file = open(log_file, 'w')

    def process_meta(self, task):
        # Record metadata only for downloads that succeeded.
        if task['success']:
            with self.lock:
                self.log_file.write('{} {} {} {}\n'.format(
                    task['filename'], task['file_url'], *task['img_size']))

When someone asked a question similar to mine earlier, this was the example code that solved it. It overrides the process_meta method so that it logs the file name, file URL, and image size. I would also like to know whether there is a way to scrape the photo title, description, and tags with the built-in Flickr crawler. Perhaps it is just a matter of using different keys in the task dict?
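To find out which keys are actually available, one option would be to log everything in the task dict rather than guessing at key names. Below is a minimal sketch of that idea. The helper name `describe_task` and the `KNOWN_KEYS` set are my own (hypothetical), and the sketch only assumes that `task` behaves like a plain dict; it does not assume that a title, description, or tags key exists.

```python
import json

# Keys the example above already relies on; anything else counts as "extra".
KNOWN_KEYS = {'success', 'filename', 'file_url', 'img_size'}


def describe_task(task):
    """Return one JSON line describing a task dict (hypothetical helper).

    Unknown keys are collected under 'extra' so they can be inspected
    in the log without assuming what the crawler provides.
    """
    extra = {key: value for key, value in task.items()
             if key not in KNOWN_KEYS}
    return json.dumps({
        'filename': task.get('filename'),
        'file_url': task.get('file_url'),
        'extra': extra,
    }, default=str, sort_keys=True)  # default=str keeps odd values loggable
```

Inside process_meta, the write call could then become `self.log_file.write(describe_task(task) + '\n')`. If the log shows the photo metadata under some key, reading the title, description, or tags from it would be the next step.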

Thanks!

spencerchubb, Apr 18 '20 06:04