dmi-tcat
Get user's network.
Hi,
first of all, let me thank you again for such great software and for keeping it alive (it will be cited in a journal on political communication and in another major election analysis book).
However, there are some features I miss and which I think would be a great contribution for many scholars. Obtaining the network (followers of followers, also known as a 1.5 network) for certain users would be a great asset for use with Gephi and for some theoretical frameworks (communication bubbles, Sunstein, 2006).
The aim is to obtain a full network of followers for certain users. A typical use is collecting the followers of certain political profiles (leaders, parties, media, institutions, etc.) in order to construct possible audiences.
At the moment, the only software capable of this is NodeXL (which I've been using for a long time), but it doesn't satisfy all the other research needs (which TCAT does) and it doesn't do this very well either (too much noise that you can't remove from the network when exporting to Gephi, e.g. friends, in_reply_to, etc.).
I can imagine other features that would also be very nice:
- Given a list of tweet_ids, get the retweet network (followers of retweeters). Use: determine what reach a retweet had in the real network and reconstruct it from a dynamic perspective.
- Since retrieving follower ids would hit the API rate limit very quickly, it would be interesting to have two options: 1) the full network and 2) a sample of the network (how to sample Twitter users is a different matter). For example, when tracking a #hashtag, add the option "get network of followers" and then offer these two options.
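To give a feel for the trade-off between the two options, here is a minimal sketch. It assumes a limit of 15 requests per 15-minute window with up to 200 users per request (the real limits depend on the endpoint and may change), and it uses a plain uniform random sample, which is only one of many possible sampling strategies. The names `estimate_hours` and `sample_followers` are made up for illustration.

```python
import math
import random


def estimate_hours(n_followers, per_request=200, requests_per_window=15,
                   window_minutes=15):
    """Rough time needed to page through n_followers under a rate limit.

    The default limits are assumptions, not guaranteed API values.
    """
    requests_needed = math.ceil(n_followers / per_request)
    windows_needed = math.ceil(requests_needed / requests_per_window)
    return windows_needed * window_minutes / 60


def sample_followers(follower_ids, k, seed=None):
    """Uniform random sample of at most k follower ids (assumed strategy)."""
    rng = random.Random(seed)
    ids = list(follower_ids)
    return rng.sample(ids, min(k, len(ids)))
```

Under these assumptions, `estimate_hours(1000000)` gives 83.5 hours for the first level of a single account with one million followers, which is why a sampling option seems worth having.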
I can contribute to this with the following script, which gets all followers of a given user (Python, Tweepy):
from __future__ import absolute_import, print_function

import time

import tweepy

# Written against the Tweepy 3.x API (api.followers, tweepy.TweepError).
consumer_key = "xxxxxxxxxxxxxxxxx"
consumer_secret = "xxxxxxxxxxxxxxxxx"
access_token = "xxxxxxxxxxxxxxxxx"
access_token_secret = "xxxxxxxxxxxxxxxxx"

accountvar = input("Account name: ")

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.secure = True
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

# Page through the followers of the given account.
users = tweepy.Cursor(api.followers, screen_name=accountvar).items()
while True:
    try:
        user = next(users)
    except tweepy.TweepError:
        # Most likely the rate limit: wait out the 15-minute window, then
        # retry inside the try block so that a second error or the end of
        # the list is handled as well.
        time.sleep(60 * 15)
        continue
    except StopIteration:
        break
    print("@" + user.screen_name)
Hi @fubespu
we could make scripts for both 'follows' and 'followed by' relations. This would involve quite a few steps, though.
In the backend
- make a script that retrieves friend lists. Note that 'followed by' for celebrities can be huge (millions of relations), while there are very strict API limits on follow relations: only 15 requests of at most 200 users each per 15 minutes per API key.
- make a 'relation' table
- consider how to use profile information (which, iirc, is currently stored in the db as part of a tweet object)
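As a rough illustration of what such a 'relation' table could look like, here is a minimal sketch. It uses Python's `sqlite3` purely for demonstration; the table layout, the column names and the helper functions `create_relation_table` and `store_followers` are assumptions and do not reflect the actual TCAT schema or database.

```python
import sqlite3


def create_relation_table(conn):
    """Create a hypothetical table holding directed follow relations."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS relation (
            source_id INTEGER NOT NULL,  -- the account that follows
            target_id INTEGER NOT NULL,  -- the account being followed
            observed_at TEXT NOT NULL,   -- when the relation was retrieved
            PRIMARY KEY (source_id, target_id)
        )
        """
    )


def store_followers(conn, user_id, follower_ids, observed_at):
    """Store one 'follower -> user' row per follower, skipping duplicates."""
    rows = [(fid, user_id, observed_at) for fid in follower_ids]
    conn.executemany(
        "INSERT OR IGNORE INTO relation (source_id, target_id, observed_at) "
        "VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
```

Storing only user ids keeps the table small; profile information could live elsewhere and be joined in when needed.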
In the frontend
- recognize whether a data set consists of relations and, if so, hide tweet-based exports such as the statistics.
- provide a new export of directed 'relation' graphs (these can be huge and very memory consuming)
Currently this does not have high priority. Comments with suggestions and pull requests are encouraged, though.
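One way to keep memory use low for such an export is to stream the relations straight to a file instead of building the whole graph first. The sketch below writes a CSV edge list with `Source`, `Target` and `Type` columns, which Gephi's spreadsheet import can read. The function name `export_edges` and the shape of the input (an iterable of follower/followed pairs) are assumptions for illustration.

```python
import csv


def export_edges(relations, out_file):
    """Write (follower, followed) pairs as a directed edge list.

    `relations` can be any iterable, e.g. a database cursor, so rows are
    written one at a time and never held in memory together.
    """
    writer = csv.writer(out_file)
    writer.writerow(["Source", "Target", "Type"])
    count = 0
    for source, target in relations:
        writer.writerow([source, target, "Directed"])
        count += 1
    return count
```

A caller would open the output with `open(path, "w", newline="")` and pass in the rows of the relation data set.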