google_news_scraper_and_sentiment_analyzer

Downloads news articles from Google News and uses pre-trained NLP models to perform sentiment analysis.

Google News scraper and sentiment analyzer using Python

Summary:

  • Downloads news articles by searching on http://www.news.google.com with keywords of interest and specific date ranges.
  • Generates CSV files of the news text.
  • Uses pre-trained NLP models to perform sentiment analysis of the news text.

google_news_scraper.py

  • A scraper script that searches Google News for a given date range using keywords of interest.
  • Sample data generated by the script for the keywords 'bitcoin cryptocurrency': [image: sample scraper output]
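
The CSV layout is not spelled out here, so the following is only a minimal sketch of how scraped text might be written out. It assumes one row per publication date, with the date in the first cell and one article text per following cell. The names `group_by_date` and `write_news_csv`, and the sample articles, are hypothetical and are not taken from google_news_scraper.py.

```python
import csv
import io
from collections import defaultdict
from datetime import date


def group_by_date(articles):
    """Group (published_date, text) pairs into {date: [text, ...]}."""
    grouped = defaultdict(list)
    for published, text in articles:
        grouped[published].append(text)
    return dict(grouped)


def write_news_csv(grouped, out_file):
    """Write one row per date: the ISO date first, then one cell per article."""
    writer = csv.writer(out_file)
    for published in sorted(grouped):
        writer.writerow([published.isoformat(), *grouped[published]])


# Hypothetical scraped results for the keywords 'bitcoin cryptocurrency'.
articles = [
    (date(2020, 1, 2), "Bitcoin rallies as trading volume climbs."),
    (date(2020, 1, 1), "Regulators warn about cryptocurrency scams."),
    (date(2020, 1, 2), "Analysts split on the cryptocurrency outlook."),
]

# An in-memory buffer stands in for a real file opened with newline="".
buffer = io.StringIO()
write_news_csv(group_by_date(articles), buffer)
csv_text = buffer.getvalue()
```

Keeping all of a day's articles in one row is an assumption chosen so that a later step can average scores across a row. The real script may organize its output differently.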

google_news_sentiment_analysis.py

  • Takes the CSV file generated by google_news_scraper.py as input.
  • Performs sentiment analysis on each cell.
  • Uses flair (https://pypi.org/project/flair/), textblob (https://pypi.org/project/textblob/), and VADER (https://www.nltk.org/_modules/nltk/sentiment/vader.html) to compute sentiment scores.
  • Averages the scores in each row (per metric) to get an overall sentiment score. The goal is a single sentiment score for all the news published on a given date.
  • Sample data generated at this stage: [image: sample sentiment analysis output]
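
The per-row averaging described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in google_news_sentiment_analysis.py. It assumes the first cell of each row is the date and the remaining cells are article texts. The function `average_row_scores` and the `toy_polarity` scorer are hypothetical stand-ins, used so the example runs without installing any NLP library.

```python
import csv
import io
from statistics import mean


def average_row_scores(row, scorers):
    """Return (date, {metric: mean score over the non-empty cells of the row})."""
    published = row[0]
    texts = [cell for cell in row[1:] if cell.strip()]
    averages = {
        name: mean(score(text) for text in texts) if texts else None
        for name, score in scorers.items()
    }
    return published, averages


# Toy stand-in scorer: +1 per positive word, -1 per negative word.
POSITIVE = {"rallies", "gains"}
NEGATIVE = {"scams", "falls"}


def toy_polarity(text):
    words = [w.strip(".,").lower() for w in text.split()]
    return float(sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words))


# With the real libraries installed, the scorers could instead be, for example:
#   "textblob": lambda t: TextBlob(t).sentiment.polarity
#   "vader": lambda t: SentimentIntensityAnalyzer().polarity_scores(t)["compound"]
scorers = {"toy": toy_polarity}

sample_csv = (
    "2020-01-01,Regulators warn about cryptocurrency scams.\n"
    "2020-01-02,Bitcoin rallies as trading volume climbs.,"
    "Analysts split on the cryptocurrency outlook.\n"
)

results = [
    average_row_scores(row, scorers)
    for row in csv.reader(io.StringIO(sample_csv))
]
```

Passing the scorers in as a dictionary keeps each metric's average separate, which matches the "by each metric" averaging in the list above. A flair classifier could be wrapped in the same way, as a function that maps a text to a number.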

This framework is used in https://github.com/pratikpv/predicting_bitcoin_market

Credits:

The scraper code is based on code from https://towardsdatascience.com/web-scraping-news-articles-in-python-9dd605799558.