sms-analysis
Python/IPython code to analyze one's text messages. Intended to work out of the box, see README for details.
Author: Michael Dezube <michael dezube at gmail dot com>
Overview of code
This code will:
- Find your latest iPhone sync (this is currently automatic only on Macs; on PCs, edit table_connector.py to find the file)
- Load the messages database and address book database locally
- Merge the databases together into fully_merged_messages_df, which you can freely play with
- Visualize a word tree of your text messages with a specific contact, see word tree screenshot
- Show you who you text the most
- Create an interactive streamgraph to visualize how your texting with people has trended over time, see streamgraph screenshot
- Create a word cloud of the words you use, and those used by your contacts, see word cloud screenshot
- Use TFIDF to understand what words identify your contacts' verbiage
- Use TFIDF to understand what words distinguish one group of contacts' verbiage from another's. For example: how do high school friends talk differently from college friends? See TFIDF contact comparison screenshot
- Use TFIDF to show you what topics were popular in texts you sent, or texts sent to you, and how this progressed over the years
Note: none of your data is modified nor sent anywhere during execution
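To make the TFIDF bullets above concrete, here is a minimal pure-Python sketch of the idea: treat each contact's combined messages as one document and score words that are frequent for that contact but rare across contacts. This is not the repository's implementation; the function names (tokenize, tfidf_by_contact) and the sample messages are hypothetical and only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a message and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_by_contact(messages_by_contact):
    """Return {contact: {word: tf-idf score}}, one 'document' per contact."""
    # Term counts over each contact's combined messages.
    counts = {
        contact: Counter(word for msg in msgs for word in tokenize(msg))
        for contact, msgs in messages_by_contact.items()
    }
    n_docs = len(counts)
    # Document frequency: how many contacts used each word at least once.
    doc_freq = Counter(word for c in counts.values() for word in c)
    scores = {}
    for contact, c in counts.items():
        total = sum(c.values())
        scores[contact] = {
            # tf = share of this contact's words; idf = log(N / doc frequency)
            word: (n / total) * math.log(n_docs / doc_freq[word])
            for word, n in c.items()
        }
    return scores


# Hypothetical sample data, not real messages.
sample = {
    "Alice": ["want to grab lunch", "lunch was great"],
    "Bob": ["want to watch the game", "great game"],
}
scores = tfidf_by_contact(sample)
for contact, word_scores in scores.items():
    best = max(word_scores, key=word_scores.get)
    print(contact, "->", best)
```

Words shared by every contact (such as "want" here) score zero, which is why TFIDF surfaces vocabulary that identifies a particular contact.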
Dependencies easy install
If you don't have pip, see https://pip.pypa.io/en/stable/installing/, or if using a Mac run sudo easy_install pip
Then run pip install -r requirements.txt and pip install "matplotlib>=1.4"
If the second command fails, then you'll have to follow these detailed Matplotlib install instructions
Dependencies with details
- Pandas
- IPython
- Matplotlib
  - The majority of the code will work without this, but certain graphs will fail
- An iPhone, having synced with this computer
  - If running on a Mac, code will work out of the box. If running on a PC, change the variable BASE_DIR in table_connector.py to the directory of your backups
  - This post seems to specify the location of backups on Windows.
- An internet connection to load the Google Visualization API (it's a very small file, though)
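As an illustration of what pointing BASE_DIR at your backups involves, the sketch below picks the most recently modified backup folder inside a base directory. It is an assumption-laden example, not the code in table_connector.py: the helper name latest_backup and the constant MAC_BACKUP_DIR are hypothetical, and the path shown is the usual macOS backup location, which may differ on your machine.

```python
import os

# Usual macOS location for iPhone backups (an assumption about your setup).
MAC_BACKUP_DIR = os.path.expanduser(
    "~/Library/Application Support/MobileSync/Backup"
)


def latest_backup(base_dir):
    """Return the most recently modified sub-directory of base_dir, or None."""
    if not os.path.isdir(base_dir):
        return None
    candidates = [
        os.path.join(base_dir, name)
        for name in os.listdir(base_dir)
        if os.path.isdir(os.path.join(base_dir, name))
    ]
    # Each backup lives in its own folder; pick the newest by modification time.
    return max(candidates, key=os.path.getmtime) if candidates else None


if __name__ == "__main__":
    print(latest_backup(MAC_BACKUP_DIR))
```

On a PC you would pass your own backup directory to the helper instead of MAC_BACKUP_DIR.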
Quick Start - Jupyter Notebook
- Start the IPython notebook like so: jupyter notebook sms_analysis.ipynb
- Under the menu choose Cell --> Run All
- Edit the CONTACT_NAME and ROOT_WORD in the last cell to alter the visualization and then re-run that cell, under menu choose: Cell --> Run Cell
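To give a feel for what a root word does in a word tree, the sketch below nests the words that follow a chosen root word into a small dictionary-based tree with counts. It only illustrates the concept and is not how the notebook builds or renders its visualization; build_word_tree, its depth parameter, and the sample messages are hypothetical.

```python
import re


def build_word_tree(messages, root_word, depth=3):
    """Nest the words that follow root_word into a dict-based tree with counts."""
    tree = {}
    root = root_word.lower()
    for message in messages:
        words = re.findall(r"[a-z']+", message.lower())
        for i, word in enumerate(words):
            if word != root:
                continue
            # Walk down the tree, one level per following word.
            node = tree
            for following in words[i + 1 : i + 1 + depth]:
                entry = node.setdefault(following, {"count": 0, "children": {}})
                entry["count"] += 1
                node = entry["children"]
    return tree


# Hypothetical messages exchanged with one contact.
sample_messages = ["I love pizza", "love pizza so much", "We love hiking"]
tree = build_word_tree(sample_messages, "love")
for word, entry in tree.items():
    print(word, entry["count"])
```

Changing the root word changes which branches appear, which is the effect of editing ROOT_WORD and re-running the last cell.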
Quick Start - Command Line
- Run python table_connector.py to see a sample of the messages and address book data
- Run python table_connector.py --full to see a sample of the messages and address book data with all of their columns
- Run python table_connector.py <output directory> to output the messages and address book data into CSV files
- Run python table_connector.py --full <output directory> to output the messages and address book data into CSV files with all of their columns
- Run python table_connector.py --help to see the arguments and their options
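Once the data is exported to CSV, it can be explored with nothing but the standard library. The sketch below counts messages per contact by joining messages to the address book. The CSV contents and the column names (phone, text, is_from_me, name) are hypothetical stand-ins, since the actual files written by table_connector.py may use different names and columns.

```python
import csv
import io
from collections import Counter

# Hypothetical CSV contents and column names; the real exports may differ.
MESSAGES_CSV = """phone,text,is_from_me
555-0100,hey there,1
555-0100,hi!,0
555-0199,running late,1
"""
ADDRESS_BOOK_CSV = """phone,name
555-0100,Alice
555-0199,Bob
"""


def count_messages_by_contact(messages_file, address_book_file):
    """Count messages per contact name, falling back to the raw number."""
    names = {
        row["phone"]: row["name"] for row in csv.DictReader(address_book_file)
    }
    counts = Counter()
    for row in csv.DictReader(messages_file):
        counts[names.get(row["phone"], row["phone"])] += 1
    return counts


counts = count_messages_by_contact(
    io.StringIO(MESSAGES_CSV), io.StringIO(ADDRESS_BOOK_CSV)
)
print(counts.most_common())  # [('Alice', 2), ('Bob', 1)]
```

With real exports you would pass open file handles from your output directory instead of the in-memory strings.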
Screenshots from running the code
Example word tree

Example streamgraph

Example word cloud

Example TFIDF contact comparison

Example of Clustering
