congress-legislators
Social media URLs to add to your legislators data set
I work with the Library of Congress, and I am using your 'legislators-current' and 'legislators-social-media' yaml files to facilitate scoping for our US Congress web archive collection. Thanks for the data!
As it turns out, you had a good chunk of social media sites that we were missing. However, we also seem to have a decent amount of data that you are missing. I have created the following JSON file outlining, per Bioguide ID, the social media accounts that we have found but that are missing from your data set. @dwillis informed me that some of it may not be pertinent to the list you keep, but I thought I might as well pass it along:
https://gist.github.com/pardery/84fc56c836c4b1f02708
Thanks, Phil
Re: "official". We only scope in sites that are linked from the congressional website as part of their social media framework. Where does that come from for your set? (Phil and I work together)
So if these are all linked to from their congressional websites, that's often a strong enough signal to accept those handles, but review is needed.
There are a few that we exclude anyway, when they're not actually for the member, like http://www.flickr.com/photos/republicanconference/ or http://www.youtube.com/HouseConference. Those are often used as placeholders until the MoC gets their own account.
Also, not all of the Twitter accounts are legit. http://twitter.com/SenRandPaul is listed, but that account doesn't exist. http://twitter.com/alangrayson is also listed, but that doesn't look legit, and http://grayson.house.gov doesn't link to it (or to any Twitter account). https://twitter.com/sethmoulton works, and is linked to by his official site, but the Twitter bio links to Moulton's campaign site, which is a mistake on someone's part and potentially opens up Moulton to charges that he's using official resources to benefit his political campaign. I'd at least want to call Moulton's office to confirm that the account is meant to be his official legislative account.
We also haven't (yet) been tracking Flickr accounts. It looks like you have 304 listed in that gist (not counting the /republicanconference/ account). That's more widespread than I thought -- perhaps the project should consider tracking Flickr accounts too.
We don't track G+ accounts, and there are only 33 plus.google.com URLs in that gist -- plus G+ is a wasteland -- so that doesn't seem worth tracking. We're also not likely to incorporate a Picasa account (e.g. http://picasaweb.google.com/congresswomanpingree/).
In any case, it looks like there are still some accounts we're missing in your gist, and I'd like to review those. Could you exclude the G+, Picasa, Flickr, and placeholder accounts, and post a link to the updated list?
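For what it's worth, the exclusion asked for above could be sketched roughly like this. This is only an illustration: the layout of the gist (a mapping of Bioguide ID to a list of URLs) and the names `keep_url` and `filter_accounts` are assumptions, not the gist's actual format or anything in the project's scripts.

```python
from urllib.parse import urlparse

# Hosts the thread says are not tracked (assumption: matched on hostname only).
EXCLUDED_HOSTS = {"plus.google.com", "picasaweb.google.com", "flickr.com"}

# Placeholder conference accounts named in the thread, as (host, path) pairs.
PLACEHOLDERS = {
    ("flickr.com", "/photos/republicanconference"),
    ("youtube.com", "/houseconference"),
}


def keep_url(url):
    """Return True if the URL is neither an excluded service nor a placeholder."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/").lower()
    if host in EXCLUDED_HOSTS:
        return False
    return (host, path) not in PLACEHOLDERS


def filter_accounts(accounts):
    """accounts: dict of Bioguide ID -> list of URLs (hypothetical layout)."""
    filtered = {}
    for bioguide, urls in accounts.items():
        kept = [u for u in urls if keep_url(u)]
        if kept:  # drop members with nothing left after filtering
            filtered[bioguide] = kept
    return filtered
```

Matching on the parsed hostname rather than a substring avoids accidentally dropping a URL that merely mentions one of these services in its path.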
Our data definitely needs cleanup. It looks like Sen. Paul (2014) and Alan Grayson (2010) each had a link to Twitter from their congressional sites. One issue is that our data is cumulative, and we do have data from other, older sources (govtrack/wikipedia). It would be great to have a reliable source so that when we make these archives public and expose this scoping, we have good data.
@pardery can make the data look any way you want. We could look at flagging our data for the accounts you care about (youtube, twitter, facebook, ?) and cleaning up bad/old data.
We really appreciate this work.
Thanks for the input @konklone. I moved the original list here and the new list excluding Picasa, Google Plus, and Flickr is in place here.
Thanks, @pardery! Sorry to leave you hanging on this -- I can review a bit later this week, but if anyone else on the project wants to jump on it before then, by all means.
Thanks!
OK, finally got around to this. Some updates below, more to come as I keep working. I have a social-media-updates branch with work in progress incorporating these and a new automated sweep.
I removed the LinkedIn, Pinterest, and Tumblr accounts from the provided file. I removed any accounts that pointed to leadership/conference accounts.
Twitter:
- https://twitter.com/repjustinamash - A private closed account.
- https://twitter.com/SenRandPaul - Nonexistent.
- Other Twitter accounts in this set were caught on a subsequent scan, after fixing a bug where we weren't picking up protocol-relative URLs. (There's one on walker.house.gov.)
- https://twitter.com/sethmoulton appears on https://moulton.house.gov, but it links to his campaign account. https://twitter.com/teammoulton is his office account, and is verified by Twitter and RTed by the @sethmoulton account, but is impersonal and not linked to from the office page. Rep. Moulton doesn't seem to have his act in order. I'm putting the @teammoulton account in as the official account, but open to suggestions of other ways to handle it (including contacting Moulton's office).
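As an aside on the protocol-relative URL bug mentioned in the list above: a link written as `//twitter.com/handle` has no scheme of its own, so a scan that only looks for `http://` or `https://` will miss it. A minimal sketch of one way to handle that, using only the standard library, is below. It is not the project's actual sweep script; `LinkCollector`, `twitter_links`, and the sample hostname are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect every <a href> on a page as an absolute URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # urljoin resolves "//host/path" against the page's own scheme,
            # and "/path" against the page's scheme and host.
            self.links.append(urljoin(self.base_url, href))


def twitter_links(html, base_url):
    """Return absolute Twitter links found in the page, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    found = []
    for link in collector.links:
        host = (urlparse(link).hostname or "").lower()
        if host in ("twitter.com", "www.twitter.com"):
            found.append(link)
    return found
```

Resolving every `href` against the page URL first, then filtering by hostname, means absolute, protocol-relative, and scheme-less links all go through the same check.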
Oh, missed some other Twitter submissions from the list:
- http://twitter.com/DonNorcrossNJ1 was renamed, appears in the new sweep.
- http://twitter.com/alangrayson is Grayson's campaign account.
Facebook:
- http://www.facebook.com/billnelson - The senator's campaign account.
- http://www.facebook.com/wyden - The senator's campaign account.
- http://www.facebook.com/CongressmanGuthrie - Doesn't link to FB from his homepage, will need a call to the MoC's office.
- http://www.facebook.com/RepJohnSarbanes - Doesn't exist.
- http://www.facebook.com/danarohrabacher - The Rep's campaign account.
- http://www.facebook.com/reppeteking - The Rep's campaign account.
- The other Facebook accounts in this set were caught on a subsequent scan.
I filed #299 with my work so far. I posted https://gist.github.com/konklone/d531370dadd55d31eb0c, which has the remaining instagram and youtube accounts you submitted.
Seth Moulton's team resolved the bad link in favor of the TeamMoulton account: https://twitter.com/sethmoulton/status/620603762751733760
I'm curious about how websites are identified for this project.
We typically use the domain and not the webserver redirect since that may change during a congress. I am seeing websites such as https://crenshaw.house.gov/index.cfm/home listed now, vice http://crenshaw.house.gov/
Keep up the good work! We are using the project data and value it highly.
> We typically use the domain and not the webserver redirect since that may change during a congress. I am seeing websites such as https://crenshaw.house.gov/index.cfm/home listed now, vice http://crenshaw.house.gov/
I share your opinion that we should be using the domain, not the specific homepage URL, as those definitely do change. The House and Senate have both stabilized on a subdomain pattern in recent years, too. I thought we were using the domain now.
I did some checking of our current legislators file (by hand) and it looks like, for each legislator's current term, we mostly are. Crenshaw is an exception, and we should look at our script again to see why that's the case. Can we move it to another ticket?
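To illustrate the "domain, not the redirect target" idea discussed here, a rough sketch follows. It assumes the desired form is simply scheme plus hostname with a trailing slash; `site_root` and `same_site` are hypothetical helper names, not functions from the project's scripts.

```python
from urllib.parse import urlparse


def site_root(url):
    """Reduce a homepage URL to scheme + hostname, dropping path and query."""
    parts = urlparse(url)
    if not parts.scheme or not parts.hostname:
        raise ValueError(f"not an absolute URL: {url!r}")
    return f"{parts.scheme}://{parts.hostname.lower()}/"


def same_site(url_a, url_b):
    """True if two URLs share a hostname, ignoring scheme and path."""
    host_a = (urlparse(url_a).hostname or "").lower()
    host_b = (urlparse(url_b).hostname or "").lower()
    return bool(host_a) and host_a == host_b
```

With this, `site_root("https://crenshaw.house.gov/index.cfm/home")` gives `https://crenshaw.house.gov/`, and `same_site` treats that URL and `http://crenshaw.house.gov/` as the same site even though the scheme and path differ.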
Was that a question for me? I can move it. I put it on this ticket because I wasn't sure if it wasn't just one of our peculiarities. There are about three on the list.
Got it, I've opened #300 to reflect that work.
I was working on a project using legislators' social media accounts and pulled some of the data in this repo. I found some new Twitter accounts for both new and returning MoCs, as well as other accounts that needed cleaning up. I looked through the guidelines for what you consider "official," so I've included information on whether these accounts are linked on their websites so you can add/edit handles at your discretion.
Twitter handles to add to repo:
id | twitter account | linked on website |
---|---|---|
R000608 | repjackyrosen | no |
C001110 | reploucorrea | no |
B001298 | repdonbacon | no |
S001199 | RepSmucker | no |
M001198 | RepMarshall | no |
S001202 | SenatorStrange | no website |
R000609 | RepRutherfordFL | yes |
J000299 | repmikejohnson | yes |
E000296 | RepDwightEvans | yes |
Twitter handles to clean up in repo:
id | twitter account | linked on website |
---|---|---|
B000574 | repblumenauer | no |
G000535 | RepGutierrez | yes |
T000462 | pattiberi | yes |
Y000064 | SenToddYoung | yes |
C001088 | ChrisCoons | yes |
In #434, I added all the linked ones and the ones below as well.
id | twitter account | linked on website |
---|---|---|
R000608 | repjackyrosen | looks official |
C001110 | reploucorrea | looks official |
B001298 | repdonbacon | looks official |
S001199 | RepSmucker | looks official |
M001198 | RepMarshall | looks official |
S001202 | SenatorStrange | looks official |
Leaving this one out since it looks strange.
id | twitter account | linked on website |
---|---|---|
B000574 | repblumenauer | looks official... but links to campaign site? holding off on this one |