russian-troll-tweets
Should be available via BitTorrent and as a web database that can be queried
First, thanks for making the data available. I was asking about this recently. I would like to get a look at troll tweets; it might help us avoid arguing with them in the future.
However --
I wasn't able to download the file, and this is not a great way to distribute the info. Better would be:
- BitTorrent distribution. It was made for data like this. GitHub, not so much.
- And it would be wonderful to have this online as a database that can be queried with SQL commands.
I would be happy to help either or both projects, assuming they don't already exist.
Thanks again for uploading the data.
Dave
I just tweeted about your second point [tweet], since I've imported the dataset into BigQuery, which has a free tier (1TB of queries). The dataset is public.
You can query the dataset like so:
SELECT author, content, followers
FROM `optimum-rock-145719.fivethirtyeight_russian_troll_tweets.russian_troll_tweets`
WHERE language = "English"
ORDER BY followers DESC
LIMIT 5
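For anyone who would rather run the same kind of query offline against the downloaded CSV files, here is a minimal sketch using Python's built-in sqlite3 module. The column names (author, content, language, followers) are taken from the query above; the table name `tweets`, the helper functions, and the sample rows are assumptions made up for illustration and are not real data.

```python
import csv
import io
import sqlite3

# Placeholder rows for illustration only; these are NOT real tweets.
# In practice you would pass an open CSV file from the repository instead.
SAMPLE_CSV = """author,content,language,followers
example_account_a,placeholder text one,English,120
example_account_b,placeholder text two,Russian,900
example_account_c,placeholder text three,English,450
"""


def load_tweets(conn, csv_file):
    """Create a (hypothetical) tweets table and fill it from a CSV file object."""
    conn.execute(
        "CREATE TABLE tweets "
        "(author TEXT, content TEXT, language TEXT, followers INTEGER)"
    )
    # DictReader yields one dict per row; the named placeholders pick out
    # only the columns needed, so extra CSV columns are ignored.
    conn.executemany(
        "INSERT INTO tweets VALUES (:author, :content, :language, :followers)",
        csv.DictReader(csv_file),
    )


def top_english_by_followers(conn, limit=5):
    """Same shape as the BigQuery query above, run against the local table."""
    query = """
        SELECT author, content, followers
        FROM tweets
        WHERE language = ?
        ORDER BY followers DESC
        LIMIT ?
    """
    return conn.execute(query, ("English", limit)).fetchall()


conn = sqlite3.connect(":memory:")
load_tweets(conn, io.StringIO(SAMPLE_CSV))
top_rows = top_english_by_followers(conn)
for row in top_rows:
    print(row)
```

To try it on the real files, replace `io.StringIO(SAMPLE_CSV)` with something like `open(path, newline="", encoding="utf-8")`, where `path` points at one of the downloaded CSVs.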
@elithrar Thank you so much for putting it into BigQuery!
I had not used BigQuery before. Here's the link to the query you ran.
https://console.cloud.google.com/bigquery?_ga=2.22451449.-337486084.1533083301&pli=1&project=nimble-gearing-94719&folder&organizationId&j=bquxjob_c3fe756_164f2e2d821&page=queryresults
I’m going to put a blog post up in the next day or so that walks through how to use BigQuery to explore this dataset, including how to make the most of the free tier with good query habits.
Will link back here!
I put the tweets online here, with a search interface:
http://24ahead.com/influence-tweets
tweets can be queried here too: http://www.fromrussiawithtroll.com/
Better late than never: I've posted a guide to querying my hosted dataset using BigQuery - https://blog.questionable.services/article/diving-into-fivethirtyeight-troll-tweets-bigquery/
e.g.
SELECT
  author,
  COUNT(*) AS count,
  FORMAT("%.2f", COUNT(*) / (
    SELECT
      COUNT(*)
    FROM
      `optimum-rock-145719.fivethirtyeight_russian_troll_tweets.russian_troll_tweets`) * 100) AS percent
FROM
  `optimum-rock-145719.fivethirtyeight_russian_troll_tweets.russian_troll_tweets`
GROUP BY
  author
ORDER BY
  -- percent is a formatted string, so sort on the numeric count instead
  count DESC
LIMIT
  10
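One thing to keep in mind with queries of this shape: FORMAT returns a string, so a column like percent compares as text rather than as a number if it is used for sorting. A small Python sketch of the difference, using made-up values rather than anything from the dataset:

```python
# Made-up percentages, formatted the same way FORMAT("%.2f", ...) would.
values = [9.5, 10.2, 0.75]
as_text = ["%.2f" % v for v in values]

# Text comparison goes character by character, so "9.50" sorts above "10.20".
text_order = sorted(as_text, reverse=True)

# Numeric comparison gives the order you would actually expect.
numeric_order = sorted(values, reverse=True)

print(text_order)     # ['9.50', '10.20', '0.75']
print(numeric_order)  # [10.2, 9.5, 0.75]
```

As long as every share stays below 10% the two orders agree, but sorting on the underlying count avoids the question entirely.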
We've also put together a tool for querying the tweets online: https://russiatweets.com