nytcrossword
An exploration of New York Times crossword answers from 1994-2017, i.e. the Will Shortz era.
24 Years of NYTimes Crossword answers
September 2, 2017
Description
Exploratory data analysis of 24 years of New York Times Crossword answers. I use data visualization and computational linguistics concepts to discover trends in the Shortz-era puzzles (1994 - present).
Questions include:
- What are the most common answers?
- Are words getting longer? Shorter?
- How does puzzle letter density vary by day?
- What words have emerged in the crossword only in the past few years?
- How lexically diverse are the puzzles?
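As a rough illustration of how a few of these questions could be approached, here is a minimal sketch in R. The toy data, the column names `year` and `answer`, and the choice of a type-token ratio as the diversity measure are all assumptions made for this example; the notebook may use different data structures and metrics.

```r
library(dplyr)

# Toy data only: these rows are made up for illustration,
# not taken from the real dataset.
answers <- tibble(
  year   = c(1994, 1994, 1994, 2017, 2017, 2017),
  answer = c("AREA", "ERA", "AREA", "OREO", "ERA", "AREA")
)

# Most common answers: count occurrences, most frequent first.
top_answers <- count(answers, answer, sort = TRUE)

# Word length over time: mean number of letters per answer, by year.
length_by_year <- answers %>%
  group_by(year) %>%
  summarise(mean_length = mean(nchar(answer)))

# Lexical diversity as a type-token ratio:
# distinct answers divided by total answers.
ttr <- n_distinct(answers$answer) / nrow(answers)
```

A type-token ratio is sensitive to sample size, so comparing puzzles or years of different sizes would likely need some form of normalization.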
Dependencies
- tidyverse for everything
- plyr for data wrangling
- here for OS-agnostic file paths
- tidytext for text analysis methods
- stringr for string-manipulation operations
- viridis for a simple, colorblind-friendly palette
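A minimal setup sketch, assuming all six packages are installed from CRAN (the notebook itself may load them differently):

```r
# One-time install from CRAN (assumed source for all six packages).
install.packages(c("tidyverse", "plyr", "here", "tidytext", "stringr", "viridis"))

# Load plyr before the tidyverse: attaching plyr after dplyr masks
# several dplyr functions (e.g. summarise, mutate).
library(plyr)
library(tidyverse)  # also attaches stringr
library(here)
library(tidytext)
library(viridis)
```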
Data Sources
The original dataset for this project was scraped from XWordInfo.com. At their request, however, I have taken down my scraper code and removed the dataset from this repository. Read the notebook for more details.