Social media urls to add to your legislators data set

Open arderyp opened this issue 9 years ago • 16 comments

I work with the Library of Congress, and I am using your 'legislators-current' and 'legislators-social-media' yaml files to facilitate scoping for our US Congress web archive collection. Thanks for the data!

As it turns out, you had a good chunk of social media sites that we were missing. However, we also seem to have a decent amount of data that you are missing. I have created the following JSON file listing, per Bioguide ID, the social media accounts that we have found but that are missing from your data set. @dwillis informed me that some of it may not be pertinent to the list you keep, but I thought I might as well pass it along:

https://gist.github.com/pardery/84fc56c836c4b1f02708

Thanks, Phil
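
For readers who want to reproduce this kind of comparison, here is a minimal sketch of a per-Bioguide-ID diff. It is not the script used to build the gist: the function name and the `{bioguide_id: {platform: handle}}` layout are assumptions made for illustration, and the IDs in the usage note are made up.

```python
def missing_accounts(ours, theirs):
    """Return the accounts in `ours` that `theirs` lacks or lists differently.

    Both arguments use a hypothetical layout: {bioguide_id: {platform: handle}}.
    Handles are compared case-insensitively.
    """
    missing = {}
    for bioguide_id, accounts in ours.items():
        known = theirs.get(bioguide_id, {})
        for platform, handle in accounts.items():
            if str(known.get(platform, "")).lower() != str(handle).lower():
                # Absent from `theirs`, or present with a different handle.
                missing.setdefault(bioguide_id, {})[platform] = handle
    return missing
```

Calling `missing_accounts(ours, theirs)` with two such dictionaries yields only the entries worth reviewing by hand; a handle that differs only in letter case is treated as already known.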

arderyp avatar May 28 '15 16:05 arderyp

Re: "official". We only scope in sites that are linked from the congressional website as part of their social media framework. Where does that come from for your set? (Phil and I work together)

gmj2053 avatar May 28 '15 17:05 gmj2053

So if these are all linked to from their congressional websites, that's often a strong enough signal to accept those handles, but review is needed.

There are a few that we exclude anyway, when they're not actually for the member, like: http://www.flickr.com/photos/republicanconference/ or http://www.youtube.com/HouseConference. Those are often used as placeholders until the MoC gets their own account.

Also, not all of the Twitter accounts are legit. http://twitter.com/SenRandPaul is listed, but that account doesn't exist. http://twitter.com/alangrayson is also listed, but that doesn't look legit, and http://grayson.house.gov doesn't link to it (or to any Twitter account). https://twitter.com/sethmoulton works, and is linked to by his official site, but the Twitter bio links to Moulton's campaign site, which is a mistake on someone's part and potentially opens up Moulton to charges that he's using official resources to benefit his political campaign. I'd at least want to call Moulton's office to confirm that the account is meant to be his official legislative account.

We also haven't (yet) been tracking Flickr accounts. It looks like you have 304 listed in that gist (not counting the /republicanconference/ account). That's more widespread than I thought -- perhaps the project should consider tracking Flickr accounts too.

We don't track G+ accounts, and there are only 33 plus.google.com URLs in that gist -- plus G+ is a wasteland -- so that doesn't seem worth tracking. We're also not likely to incorporate a Picasa account (e.g. http://picasaweb.google.com/congresswomanpingree/)

In any case, it looks like there are still some accounts we're missing in your gist, and I'd like to review those. Could you exclude the G+, Picasa, Flickr, and placeholder accounts, and post a link to the updated list?
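
The exclusion requested above could be sketched roughly as follows. This assumes the gist maps each Bioguide ID to a list of URLs, which is a guess at its layout; the function names are hypothetical, and the excluded hosts and placeholder URLs are the ones named in this comment.

```python
from urllib.parse import urlparse

# Hosts the comment above says are not tracked (G+, Picasa, Flickr).
EXCLUDED_HOSTS = {"plus.google.com", "picasaweb.google.com", "flickr.com"}

# Leadership/conference placeholder accounts named in the thread,
# stored without a trailing slash so comparison is slash-insensitive.
PLACEHOLDER_URLS = {
    "http://www.flickr.com/photos/republicanconference",
    "http://www.youtube.com/HouseConference",
}

def host_of(url):
    """Return the lowercased host with any leading 'www.' removed."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host

def keep(url):
    """True if the URL is neither a placeholder nor on an excluded host."""
    if url.rstrip("/") in PLACEHOLDER_URLS:
        return False
    return host_of(url) not in EXCLUDED_HOSTS

def filter_accounts(accounts):
    """Filter {bioguide_id: [url, ...]} (assumed layout), dropping empty entries."""
    filtered = {}
    for bioguide_id, urls in accounts.items():
        kept = [u for u in urls if keep(u)]
        if kept:
            filtered[bioguide_id] = kept
    return filtered
```

Matching on the parsed host rather than on a substring avoids accidentally dropping a URL that merely mentions "flickr" somewhere in its path.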

konklone avatar May 31 '15 23:05 konklone

Our data definitely needs cleanup. It looks like Sen. Paul (2014) and Alan Grayson (2010) had links to Twitter from their congressional sites. One issue is that our data is cumulative, and we do have data from other, older sources (govtrack/wikipedia), so it would be great to have a reliable source, so that when we make these archives public and expose this scoping, we have good data.

@pardery can make the data look any way you want. It could be that we look at flagging our data for the accounts you care about (youtube, twitter, facebook, ?) and clean up bad/old data.

We really appreciate this work.

gmj2053 avatar Jun 01 '15 11:06 gmj2053

Thanks for the input @konklone. I moved the original list here and the new list excluding Picasa, Google Plus, and Flickr is in place here

arderyp avatar Jun 01 '15 12:06 arderyp

Thanks, @pardery! Sorry to leave you hanging on this -- I can review a bit later this week, but if anyone else on the project wants to jump on it before then, by all means.

konklone avatar Jun 08 '15 06:06 konklone

Thanks!

arderyp avatar Jun 08 '15 13:06 arderyp

OK, finally got around to this. Some updates below, more to come as I keep working. I have a social-media-updates branch with work in progress incorporating these and a new automated sweep.

I removed the LinkedIn, Pinterest, and Tumblr accounts from the provided file. I removed any accounts that pointed to leadership/conference accounts.

Twitter:

  • https://twitter.com/repjustinamash - A private closed account.
  • https://twitter.com/SenRandPaul - Nonexistent.
  • Other Twitter accounts in this set were caught on a subsequent scan, after fixing a bug where we weren't picking up protocol-relative URLs. (There's one on walker.house.gov.)
  • https://twitter.com/sethmoulton appears on https://moulton.house.gov, but it links to his campaign account. https://twitter.com/teammoulton is his office account, and is verified by Twitter and RTed by the @sethmoulton account, but is impersonal and not linked to from the office page. Rep. Moulton doesn't seem to have his act in order. I'm putting the @teammoulton account in as the official account, but open to suggestions of other ways to handle it (including contacting Moulton's office).
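
The protocol-relative bug mentioned above is easy to reproduce: a scan that only looks for `http://` or `https://` links misses `//twitter.com/...`. The following is a minimal sketch of a scan that accepts both forms. It is not the project's actual script; the function name, the regular expression, and the handle in the usage note are assumptions for illustration.

```python
import re
from urllib.parse import urljoin

# Matches href values pointing at twitter.com, whether the link is
# absolute (http:// or https://) or protocol-relative (//twitter.com/...).
# The captured group excludes any trailing slash.
TWITTER_HREF = re.compile(
    r'href=["\']((?:https?:)?//(?:www\.)?twitter\.com/[A-Za-z0-9_]+)/?["\']',
    re.IGNORECASE,
)

def find_twitter_links(html, page_url):
    """Return absolute Twitter URLs found in html, resolved against page_url."""
    # urljoin gives a protocol-relative link the scheme of the page it is on.
    return [urljoin(page_url, href) for href in TWITTER_HREF.findall(html)]
```

For a page fetched from `https://walker.house.gov/`, a link written as `//twitter.com/RepExample` (a made-up handle) resolves to `https://twitter.com/RepExample`. A real scan would more likely use an HTML parser than a regular expression; the regex keeps the sketch short.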

konklone avatar Jul 11 '15 22:07 konklone

Oh, missed some other Twitter submissions from the list:

  • http://twitter.com/DonNorcrossNJ1 was renamed, appears in the new sweep.
  • http://twitter.com/alangrayson is Grayson's campaign account.

Facebook:

  • http://www.facebook.com/billnelson - The senator's campaign account.
  • http://www.facebook.com/wyden - The senator's campaign account.
  • http://www.facebook.com/CongressmanGuthrie - Doesn't link to FB from his homepage, will need a call to the MoC's office.
  • http://www.facebook.com/RepJohnSarbanes - Doesn't exist.
  • http://www.facebook.com/danarohrabacher - The Rep's campaign account.
  • http://www.facebook.com/reppeteking - The Rep's campaign account.
  • The other Facebook accounts in this set were caught on a subsequent scan.

konklone avatar Jul 11 '15 22:07 konklone

I filed #299 with my work so far. I posted https://gist.github.com/konklone/d531370dadd55d31eb0c, which has the remaining instagram and youtube accounts you submitted.

konklone avatar Jul 11 '15 22:07 konklone

Seth Moulton's team resolved the bad link in favor of the TeamMoulton account: https://twitter.com/sethmoulton/status/620603762751733760

konklone avatar Jul 13 '15 14:07 konklone

I'm curious about how websites are identified for this project.

We typically use the domain and not the webserver redirect since that may change during a congress. I am seeing websites such as https://crenshaw.house.gov/index.cfm/home listed now, vice http://crenshaw.house.gov/

Keep up the good work! We are using the project data and value it highly.

gmj2053 avatar Jul 16 '15 12:07 gmj2053

> We typically use the domain and not the webserver redirect since that may change during a congress. I am seeing websites such as https://crenshaw.house.gov/index.cfm/home listed now, vice http://crenshaw.house.gov/

I share your opinion that we should be using the domain, not the specific homepage URL, as those definitely do change. The House and Senate have both stabilized on a subdomain pattern in recent years, too. I thought we were using the domain now.

I did some checking of our current legislators file (by hand) and it looks like, for each legislator's current term, we mostly are. Crenshaw is an exception, and we should look at our script again to see why that's the case. Can we move it to another ticket?
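
As an illustration of "domain, not homepage URL", here is a minimal sketch that reduces a redirect target to its site root. The function name is hypothetical and this is not the project's script. It keeps whatever scheme it is given and drops any port, which may or may not be the desired behavior.

```python
from urllib.parse import urlsplit

def site_root(url):
    """Reduce a homepage URL to scheme + host, dropping path, query, and fragment."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.hostname:
        raise ValueError(f"not an absolute URL: {url!r}")
    # hostname is already lowercased by urlsplit and excludes any port.
    return f"{parts.scheme}://{parts.hostname}/"
```

With this, `site_root("https://crenshaw.house.gov/index.cfm/home")` gives `https://crenshaw.house.gov/`, so a mid-congress change to the redirect path would not alter the stored value.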

konklone avatar Jul 16 '15 14:07 konklone

Was that a question for me? I can move it. I put it on this ticket because I wasn't sure if it wasn't just one of our peculiarities. There are about three on the list.

gmj2053 avatar Jul 16 '15 16:07 gmj2053

Got it, I've opened #300 to reflect that work.

konklone avatar Jul 16 '15 16:07 konklone

I was working on a project using legislators' social media accounts and pulled some of the data in this repo. I found some new Twitter accounts for both new and returning MoCs, as well as other accounts that needed cleaning up. I looked through the guidelines for what you consider "official," so I've included information on whether these accounts are linked on their websites so you can add/edit handles at your discretion.

Twitter handles to add to repo:

| id | twitter account | linked on website |
| --- | --- | --- |
| R000608 | repjackyrosen | no |
| C001110 | reploucorrea | no |
| B001298 | repdonbacon | no |
| S001199 | RepSmucker | no |
| M001198 | RepMarshall | no |
| S001202 | SenatorStrange | no website |
| R000609 | RepRutherfordFL | yes |
| J000299 | repmikejohnson | yes |
| E000296 | RepDwightEvans | yes |

Twitter handles to clean up in repo:

| id | twitter account | linked on website |
| --- | --- | --- |
| B000574 | repblumenauer | no |
| G000535 | RepGutierrez | yes |
| T000462 | pattiberi | yes |
| Y000064 | SenToddYoung | yes |
| C001088 | ChrisCoons | yes |

soooh avatar Feb 13 '17 19:02 soooh

In #434, I added all the linked ones and the ones below as well.

| id | twitter account | linked on website |
| --- | --- | --- |
| R000608 | repjackyrosen | looks official |
| C001110 | reploucorrea | looks official |
| B001298 | repdonbacon | looks official |
| S001199 | RepSmucker | looks official |
| M001198 | RepMarshall | looks official |
| S001202 | SenatorStrange | looks official |

Leaving this one out since it looks strange:

| id | twitter account | linked on website |
| --- | --- | --- |
| B000574 | repblumenauer | looks official... but links to campaign site? holding off on this one |

joelcollinsdc avatar Feb 13 '17 22:02 joelcollinsdc