Media Cloud
ultimate-sitemap-parser
Ultimate Website Sitemap Parser
backend
Media Cloud is an open source, open data platform that allows researchers to answer quantitative questions about the content of online media.
sentence-splitter
Text to sentence splitter using a heuristic algorithm by Philipp Koehn and Josh Schroeder.
cliff-annotator
A lightweight server to allow HTTP requests to the Stanford Named Entity Recognizer and a heavily modified CLAVIN geoparser.
web-tools
The shared repository for Media Cloud web apps (Explorer, Source Manager, Topic Mapper).
api-client
Public client for consuming content from the Media Cloud Online News Archive & Directory.
api-tutorial-notebooks
A set of Jupyter notebooks demonstrating how to use the Media Cloud API.
date_guesser
A library to extract a publication date from a web page, along with a measure of the accuracy.