twarch
A Twitter Archive thing
Import your Twitter Archive into a SQLite database and then do stuff with it.
Requirements
- PHP 5.3 or newer
- SQLite PHP Extension
- Linux of some description
- Composer
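If you want to check most of that list from a shell before installing, something like the sketch below should do. The `have` helper is a name made up for this example, and matching `sqlite` in the output of `php -m` is an assumption: this README does not say which SQLite extension (for instance `sqlite3` or `pdo_sqlite`) twarch relies on. Composer is often installed as `composer.phar` instead, in which case the last check will report it as missing.

```shell
#!/bin/sh
# Rough pre-flight check; not part of twarch itself.

# have NAME: succeeds if NAME is an executable on the PATH.
have() {
    command -v "$1" >/dev/null 2>&1
}

if have php; then
    # version_compare() and PHP_VERSION are standard PHP.
    if php -r 'exit(version_compare(PHP_VERSION, "5.3.0", ">=") ? 0 : 1);'; then
        echo "PHP version: ok"
    else
        echo "PHP version: older than 5.3"
    fi
    # php -m lists the loaded modules. Which SQLite module twarch needs
    # is an assumption, so this matches any module with "sqlite" in its name.
    if php -m | grep -qi sqlite; then
        echo "SQLite extension: found"
    else
        echo "SQLite extension: not found"
    fi
else
    echo "php: not found"
fi

if have composer; then
    echo "composer: found"
else
    echo "composer: not found"
fi
```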
Installation
Either:
- Download the tarball
- Extract it with tar xzf twarch-0.0.2.tgz
- cd into the newly created directory
Or:
- Clone the repo with git clone git://github.com/TomNomNom/twarch.git
- cd into the newly created directory
- Run composer install to get the dependencies
Importing your Twitter Archive
- Unzip your Twitter Archive somewhere. In this example, mine is extracted into a directory called 'tweets':
▶ unzip tweets.zip -d tweets
- Create an empty DB using the createdb mode:
▶ php twarch.php createdb
Successfully created DB
- Import your Tweets from the JS files in the archive by using the import mode:
▶ php twarch.php import tweets/data/js/tweets/*.js
Removing old Tweets...
Importing Tweets from [tweets/data/js/tweets/2008_11.js]
Importing Tweets from [tweets/data/js/tweets/2008_12.js]
...
Importing Tweets from [tweets/data/js/tweets/2013_01.js]
Importing Tweets from [tweets/data/js/tweets/2013_02.js]
Imported 15027 Tweets
- Do stuff with your data!
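If you're curious what the import step has to read, the sketch below peeks inside one of those JS files. It assumes each monthly file starts with a JavaScript assignment (something like `Grailbird.data.tweets_2008_11 = `) followed by a plain JSON array. That is how archives from this period are commonly laid out, but this README does not confirm it, and twarch's own parser may work differently. `strip_js_prefix` is a made-up name, and the sample file is a tiny stand-in, not real archive content.

```shell
#!/bin/sh
# strip_js_prefix FILE: print FILE with everything up to and including the
# first "=" on line 1 removed, leaving (under the assumption above) plain JSON.
strip_js_prefix() {
    sed '1s/^[^=]*=[[:space:]]*//' "$1"
}

# Tiny stand-in for one archive file, written to a temporary location.
sample=$(mktemp)
printf '%s\n' 'Grailbird.data.tweets_2008_11 = ' '[ { "id": 1023688023 } ]' > "$sample"

strip_js_prefix "$sample"
rm -f "$sample"
```

The output can then be fed to any JSON tool you like.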
Stuff to do with your data
You can search for something:
▶ php twarch.php find "Java is weird"
+------------+--------------------------+------------------------------------------------------------+
| Id | Created | Text |
+------------+--------------------------+------------------------------------------------------------+
| 1023688023 | 2008-11-26T00:42:07+0000 | Done the 'hello, world' thing on the G1 now. Java is weird |
+------------+--------------------------+------------------------------------------------------------+
List all of your Tweets:
▶ php twarch.php all
+--------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+
| Id | Created | Text |
+--------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+
| 1010488898 | 2008-11-18T00:34:32+0000 | At 15 of 80gb. Data recovery sucks. |
| 1010523359 | 2008-11-18T01:02:23+0000 | "Time for bed", said Zebedee... "Piss off, you springy bastard", said Florence |
...
| 305969208901115904 | 2013-02-25T09:15:18+0000 | @ghalfacree dead easy to do. I reckon you could manage it :) |
+--------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+
Get a list of words and how often you used them:
▶ php twarch.php uniquewords --min-count=500 --min-word-length=6
+-------------+-------+
| Word        | Count |
+-------------+-------+
| @lingmops   | 1471  |
| @ghalfacree | 862   |
| @scawp      | 817   |
| @johnmclear | 621   |
+-------------+-------+
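A rough idea of what a word count like this involves, as a shell pipeline. This is a sketch, not twarch's code: it assumes one Tweet per line on stdin and splits on whitespace only (the @handles in the sample output suggest punctuation is kept, but the tokenization is not documented here). `word_counts` and its two arguments are names invented for the example, standing in for --min-count and --min-word-length.

```shell
#!/bin/sh
# word_counts MIN_COUNT MIN_LEN: read text on stdin, print "word count"
# for every word at least MIN_LEN long that occurs at least MIN_COUNT times,
# most frequent first.
word_counts() {
    min_count=$1
    min_len=$2
    tr -s '[:space:]' '\n' |                                   # one word per line
        awk -v n="$min_len" 'length($0) > 0 && length($0) >= n' |  # drop short words
        sort |
        uniq -c |                                              # "count word"
        awk -v c="$min_count" '$1 >= c { print $2, $1 }' |     # drop rare words
        sort -k2,2nr -k1,1                                     # by count, then word
}

printf '%s\n' \
    '@scawp that is weird' \
    '@scawp weird indeed' |
    word_counts 2 5
```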
Get a total word count and other stats about your Tweets:
▶ php twarch.php stats
Tweets: 15211
Total words: 211827
Words per Tweet: 14
Total characters: 1250269
Characters per Tweet: 82
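The two averages above are consistent with rounding to the nearest whole number (211827 / 15211 is about 13.9, and 1250269 / 15211 is about 82.2). Here is a minimal sketch of the same arithmetic, assuming one Tweet per line on stdin and treating a word as any whitespace-separated run. `tweet_stats` is a made-up name, and `length()` may count bytes rather than characters depending on your awk and locale, so totals for non-ASCII text can differ from twarch's.

```shell
#!/bin/sh
# tweet_stats: read one Tweet per line on stdin and print totals and
# per-Tweet averages, rounded to the nearest whole number.
tweet_stats() {
    awk '
        { tweets++; words += NF; chars += length($0) }
        END {
            if (tweets == 0) { print "Tweets: 0"; exit }
            printf "Tweets: %d\n", tweets
            printf "Total words: %d\n", words
            printf "Words per Tweet: %.0f\n", words / tweets
            printf "Total characters: %d\n", chars
            printf "Characters per Tweet: %.0f\n", chars / tweets
        }'
}

printf '%s\n' 'a b c' 'd e' 'f g' | tweet_stats
```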
Get a list of who you've mentioned and how often:
▶ php twarch.php mentions
+--------------------------+-------+
| Handle | Count |
+--------------------------+-------+
| @lingmops | 1490 |
| @ghalfacree | 871 |
| @scawp | 822 |
| @johnmclear | 624 |
+--------------------------+-------+
See which hashtags you've used and how often:
▶ php twarch.php hashtags
+-------------------------------------------------------------------------+-------+
| Hashtag | Count |
+-------------------------------------------------------------------------+-------+
| #songsincode | 17 |
...
| #joke | 1 |
+-------------------------------------------------------------------------+-------+
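Both the mentions and hashtags modes boil down to pulling out tokens with a given prefix and counting them. The sketch below shows that idea with grep; it is not how twarch does it. `count_tokens` is a name invented here, the character class `[A-Za-z0-9_]` is an assumption about what counts as part of a handle or hashtag, and the pattern will also pick up things like the tail end of an email address.

```shell
#!/bin/sh
# count_tokens PREFIX: read text on stdin and print "token count" for every
# token that starts with PREFIX ("@" or "#"), most frequent first.
count_tokens() {
    grep -oE "$1[A-Za-z0-9_]+" |     # one match per line
        sort |
        uniq -c |                    # "count token"
        awk '{ print $2, $1 }' |
        sort -k2,2nr -k1,1           # by count, then token
}

printf '%s\n' \
    '@ghalfacree dead easy to do. I reckon you could manage it :)' \
    'thanks @ghalfacree and @scawp #joke' |
    count_tokens '@'
```

Passing '#' instead of '@' gives the hashtag counts.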
Find out what time of day you Tweet the most:
▶ php twarch.php timeofday
+------+-------+
| Hour | Count |
+------+-------+
| 00   | 218   |
| 01   | 80    |
...
| 22   | 552   |
| 23   | 317   |
+------+-------+
Which day of the week you Tweet the most:
▶ php twarch.php dayofweek
+-----+-------+
| Day | Count |
+-----+-------+
| Mon | 2316  |
...
| Sun | 1333  |
+-----+-------+
Or which day of the month:
▶ php twarch.php dayofmonth
+-----+-------+
| Day | Count |
+-----+-------+
| 01  | 515   |
| 02  | 548   |
...
| 30  | 368   |
| 31  | 249   |
+-----+-------+
Or even which month of the year:
▶ php twarch.php monthofyear
+-------+-------+
| Month | Count |
+-------+-------+
| Jan   | 1586  |
| Feb   | 1598  |
...
| Nov   | 1267  |
| Dec   | 1232  |
+-------+-------+
You can see how much you've Tweeted over time (with year, month or day resolution):
▶ php twarch.php trend --resolution=year
+------+-------+
| Year | Count |
+------+-------+
| 2008 | 42    |
| 2009 | 1510  |
...
| 2012 | 5617  |
| 2013 | 864   |
+------+-------+
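Because the timestamps shown earlier have a fixed layout (2008-11-26T00:42:07+0000), the timeofday, dayofmonth, monthofyear and trend modes can all be pictured as counting on a slice of that string. The sketch below does just that. `bucket_counts` and its bucket names are made up for the example, it prints month numbers where twarch prints names, and it uses the hour exactly as written; whether twarch converts to local time first is not stated here. dayofweek needs real date arithmetic, so it is left out.

```shell
#!/bin/sh
# bucket_counts BUCKET: read one timestamp per line on stdin and print
# "bucket count", ordered by bucket.
bucket_counts() {
    case "$1" in
        year)        range=1-4   ;;   # like trend --resolution=year
        month)       range=1-7   ;;   # like trend --resolution=month
        day)         range=1-10  ;;   # like trend --resolution=day
        hour)        range=12-13 ;;   # like timeofday
        dayofmonth)  range=9-10  ;;   # like dayofmonth
        monthofyear) range=6-7   ;;   # like monthofyear
        *) echo "unknown bucket: $1" >&2; return 1 ;;
    esac
    # Zero-padded fields sort correctly as plain text.
    cut -c "$range" | sort | uniq -c | awk '{ print $2, $1 }'
}

printf '%s\n' \
    2008-11-18T00:34:32+0000 \
    2008-11-18T01:02:23+0000 \
    2008-11-26T00:42:07+0000 \
    2013-02-25T09:15:18+0000 |
    bucket_counts hour
```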
Updating your data
You can update your data via HTTP with the sync mode:
▶ php twarch.php sync tomnomnom
Last Tweet had ID [305359084289396736]
Importing Tweet with ID [305379759108530177]
Imported 1 Tweets
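Going by that output, sync looks up the ID of the newest stored Tweet and only imports Tweets with a higher ID. How the Tweets are fetched over HTTP is not described here, so the sketch below covers only the filtering step. `newer_than` is a made-up name. Tweet IDs of this size are too big for awk's floating-point numbers to compare exactly, so the comparison uses the shell's integer test and assumes the shell works with 64-bit integers.

```shell
#!/bin/sh
# newer_than LAST_ID: read one Tweet ID per line on stdin and print those
# greater than LAST_ID.
newer_than() {
    last=$1
    while IFS= read -r id; do
        # Integer comparison; relies on the shell using 64-bit integers.
        if [ "$id" -gt "$last" ]; then
            printf '%s\n' "$id"
        fi
    done
}

printf '%s\n' 305359084289396736 305379759108530177 |
    newer_than 305359084289396736
```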
Coming soon
- Possibly a Phar version
- Other, interesting, modes to do more stuff with your data