house_expenditures
Standardize Payee Names (particularly for individuals)
A key goal for us is to identify as best we can congressional staffers over time. Since they often move from office to office, we need a standardized version of their names to help with that. The problem is that there is no real unique ID for them, so we've only got a little bit of context to help us. But: it's rare for staffers who work in lawmakers' offices to work for a member of a different party, so in most cases we can assume that the combination of name-office-party would be unique(ish). The date context (each report covers a quarter) also could help with that.
My ideal result from this is a set of canonical (or close to it) names along with offices and dates. You can add party information for those records that have a bioguide_id value via the ProPublica Congress API or the United States organization on GitHub.
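To make the name-plus-party idea concrete, here is a minimal sketch, not a finished solution. It assumes each record is a dict with hypothetical `payee`, `office`, `party`, and `quarter` fields (the real column names may differ), and that payee names mostly arrive as "LAST, FIRST MIDDLE". It does not handle suffixes such as "JR" or nicknames.

```python
import re
from collections import defaultdict


def normalize_payee(raw):
    """Return an uppercase 'FIRST MIDDLE LAST' form of a payee name."""
    name = raw.upper().strip()
    # Reorder "LAST, FIRST MIDDLE" into "FIRST MIDDLE LAST".
    if "," in name:
        last, _, rest = name.partition(",")
        name = f"{rest} {last}"
    # Replace punctuation with spaces, keeping hyphens and apostrophes.
    name = re.sub(r"[^A-Z'\- ]", " ", name)
    # Collapse runs of whitespace.
    return " ".join(name.split())


def group_staffers(records):
    """Group records by (normalized name, party).

    Returns a dict mapping each key to a sorted list of
    (quarter, office) pairs seen for that key.
    """
    grouped = defaultdict(set)
    for rec in records:
        key = (normalize_payee(rec["payee"]), rec["party"])
        grouped[key].add((rec["quarter"], rec["office"]))
    return {key: sorted(seen) for key, seen in grouped.items()}
```

Two people with the same name in the same party would still collide under this key, so the quarter and office lists are kept for manual review rather than treated as proof of identity.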
Hey!
I've been getting my feet wet working on this issue.
I've submitted a pull request with some preliminary work.
Currently, I'm focusing on congressional staffers. I found them by restricting the House expenditure data to personnel compensation records. Is this a satisfactory approach?
Additionally, roughly 30% of the House expenditure records don't have a bioguide ID. At a cursory glance, one example of an office without one is the Office of the Speaker. Maybe we can check the date of each such entry and populate a party accordingly.
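A rough sketch of that date-based fill for the Speaker's office is below. The field names (`office`, `year`, `bioguide_id`, `party`) and the office label are assumptions about the data, not its actual schema. The year table only covers 2009 through 2018 and ignores the first few days of January 2011, when the previous Speaker was still in office, which should not matter at quarterly granularity.

```python
# Party of the Speaker of the House by calendar year (2009-2018 only).
SPEAKER_PARTY_BY_YEAR = {2009: "D", 2010: "D"}
SPEAKER_PARTY_BY_YEAR.update({year: "R" for year in range(2011, 2019)})

# Hypothetical office label; the real data may spell it differently.
SPEAKER_OFFICE = "OFFICE OF THE SPEAKER"


def fill_party(record):
    """Return a copy of the record with party filled in where possible.

    Records that already have a bioguide_id or a party are left alone,
    since their party should come from the member lookup instead.
    """
    filled = dict(record)
    if filled.get("bioguide_id") or filled.get("party"):
        return filled
    if filled.get("office", "").strip().upper() == SPEAKER_OFFICE:
        # Years outside the table yield None rather than a guess.
        filled["party"] = SPEAKER_PARTY_BY_YEAR.get(filled.get("year"))
    return filled
```

Other leadership offices would need their own tables, and committees are deliberately left without a party here.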
I'm open to all suggestions!
Hi @vickitran - thanks! I think this makes sense to add, and staffers are a big emphasis for us. If we can standardize them it would provide us with ways to learn about turnover in offices & also to connect them to both lawmakers & lobbying efforts.
In terms of records without a bioguide_id, most of them are offices like the Speaker and committees. I think you could make a case that we should add IDs for leadership offices but committees and other House organizations shouldn't have them. Does that make sense?
Hey @dwillis,
Yes! Thanks for the explanation regarding committees. The system can be quite confusing.
I've been playing with the data a bit, since I'm really interested in visualizing staffers across time, for example whether their positions get more prestigious.
Here's my nb so far.
I found a similar site that documents a staffer's employment history.
I'm thinking of some visualizations to do now, but that's not exactly my forte. I'll post in the Slack channel to see if anyone has any suggestions!
Also, one particular column is a bit misleading.
In my nb, there are some examples where an entry is from [q1 - 2010] but the matching Congress number is 115.
This happens because the mapping uses the most recent member data: this rep. is currently serving, so their corresponding Congress number is the latest one rather than the one in session when the expenditure was made.
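One possible workaround is to derive the Congress number from the record's own quarter instead of from the member mapping. The sketch below assumes a hypothetical quarter label like "2010Q1" (the real data may encode the quarter differently) and relies on each Congress spanning two calendar years starting in an odd year, with the 1st Congress beginning in 1789. It ignores the first few days of January in odd years, which technically belong to the outgoing Congress.

```python
import re

# Matches labels such as "2010Q1" or "2010 q1" (assumed format).
QUARTER_PATTERN = re.compile(r"^\s*(\d{4})\s*Q([1-4])\s*$", re.IGNORECASE)


def congress_for_year(year):
    """Return the number of the Congress in session for most of a year."""
    if year < 1789:
        raise ValueError(f"no Congress before 1789: {year}")
    return (year - 1789) // 2 + 1


def congress_for_quarter(quarter):
    """Return the Congress number for a quarter label like '2010Q1'."""
    match = QUARTER_PATTERN.match(quarter)
    if match is None:
        raise ValueError(f"unrecognized quarter label: {quarter!r}")
    return congress_for_year(int(match.group(1)))
```

With this, `congress_for_quarter("2010Q1")` gives 111 rather than 115, since (2010 - 1789) // 2 + 1 = 111.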
There are cases where an expenditure from a previous quarter shows up in a later quarter's report. Usually these are adjustments to earlier transactions. But we can either discard such things for congress-to-congress comparisons or I can fix the original data.