russian-troll-tweets
Save the community some typing
I know this is an abuse of "issues", but it doesn't warrant a full repo. Here is some Python code you can cut and paste:
class Rutweet:
    # Holds the fields of one tweet row.
    def __init__(self, external_author_id, author, content,
                 region, language, publish_date, harvested_date,
                 following, followers, updates, post_type, account_type,
                 retweet, account_category, new_june_2018):
        self.external_author_id = external_author_id
        self.author = author
        self.content = content
        self.region = region
        self.language = language
        self.publish_date = publish_date
        self.harvested_date = harvested_date
        self.following = following
        self.followers = followers  # was accepted but never stored
        self.updates = updates
        self.post_type = post_type
        self.account_type = account_type
        self.retweet = retweet
        self.account_category = account_category
        self.new_june_2018 = new_june_2018
And a quick loader.
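import csv

def load_tweets(fn):
    # Returns a list of Rutweet objects, one per row of the file.
    # If the file starts with a header row, it is loaded like any other row.
    tweets = []
    # newline='' lets the csv module handle quoted fields itself
    with open(fn, 'r', newline='') as f:
        # csv.reader keeps commas inside quoted tweet text in one field,
        # which a plain line.split(',') does not
        for fields in csv.reader(f):
            tweets.append(Rutweet(*fields[:15]))
    return tweets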
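Tweet text often contains commas, so splitting each line on ',' can shift every later column. Here is a small sketch of the difference, using a made-up, shortened sample row (not taken from the dataset):

```python
import csv
import io

# Made-up sample row: the third field (tweet text) contains a comma, so it is quoted.
sample = '123,some_author,"Hello, world",Unknown,English\n'

# Naive splitting cuts the quoted tweet text in two and shifts the later columns.
naive_fields = sample.rstrip('\n').split(',')

# csv.reader honours the quotes and keeps the tweet text as a single field.
csv_fields = next(csv.reader(io.StringIO(sample)))

print(len(naive_fields))  # 6
print(len(csv_fields))    # 5
print(csv_fields[2])      # Hello, world
```

With the naive split, the row has one field too many and everything after the tweet text lands in the wrong attribute.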
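If you're using Python 3, emojis will sometimes break Unicode decoding of the text files. If you get a UnicodeDecodeError, use open(fn, 'r', encoding="latin-1"). Note that latin-1 maps every byte to a character, so it silences the error but will garble emojis if the files are actually UTF-8; in that case encoding="utf-8" is the better choice.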
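One way to handle the decoding problem is to try UTF-8 first and only fall back to latin-1 when that fails. This is a rough sketch: the helper name `read_text_with_fallback` is made up for illustration, and it assumes the files are most likely UTF-8, which I have not verified for every file.

```python
def read_text_with_fallback(path, encodings=("utf-8", "latin-1")):
    """Return (text, encoding_used), trying each encoding in order.

    Hypothetical helper, not part of the dataset repo.
    """
    if not encodings:
        raise ValueError("at least one encoding is required")
    last_error = None
    for enc in encodings:
        try:
            with open(path, "r", encoding=enc) as f:
                return f.read(), enc
        except UnicodeDecodeError as err:
            # Remember the failure and try the next candidate encoding.
            last_error = err
    # Only reached if every encoding failed to decode the file.
    raise last_error
```

Because latin-1 accepts any byte sequence, putting it last means the call always returns something; the returned encoding name tells you whether the text can be trusted or may contain garbled characters.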
If you want to make this work with my schema, I'll make you a contributor and we can develop on it instead.
- https://github.com/EvanCarroll/russian-troll-tweets
I think putting the Python code in a subdirectory under python would be a great idea for Python users. But this code is for the older v1 dataset, not the v2 dataset. I've done the same thing for PostgreSQL; you can find my scripts under ./PostgreSQL
@Meeds122 see my note at the bottom of #20
https://github.com/fivethirtyeight/russian-troll-tweets/issues/20#issuecomment-416716716
@EvanCarroll Great, could you create an issue and assign it to me? I'll try to contribute this week. I have some other things I could add as well.