Affiliation data format and parsers

Open sbenthall opened this issue 5 years ago • 3 comments

  • Develop a portable data format for storing and transferring information about the affiliations between individuals and organizations
  • This will be a special case of a more general serializable data format that is both machine- and human-readable (e.g. YAML)
  • Deliver this format, and tooling for parsing and writing it in Python, as code in BigBang
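As a rough illustration of what such a format and its Python tooling could look like, here is a minimal sketch. The `Affiliation` class, its field names, and the `dump_affiliations`/`load_affiliations` helpers are all hypothetical and are not existing BigBang code. The issue mentions YAML as an example; the sketch uses the standard-library `json` module only to stay dependency-free, and the same record structure could be written out with a YAML library instead.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class Affiliation:
    """One person-organization link. All field names are hypothetical."""
    person: str
    organization: str
    start: Optional[str] = None  # ISO 8601 date string, if known
    end: Optional[str] = None    # ISO 8601 date string, if known


def dump_affiliations(records: List[Affiliation]) -> str:
    """Serialize records to indented, human-readable JSON text."""
    return json.dumps([asdict(r) for r in records], indent=2, sort_keys=True)


def load_affiliations(text: str) -> List[Affiliation]:
    """Parse text produced by dump_affiliations back into records."""
    return [Affiliation(**item) for item in json.loads(text)]


# Made-up example data, for illustration only.
records = [
    Affiliation("Alice Example", "Example Corp", start="2019-01-01", end="2020-06-30"),
    Affiliation("Bob Sample", "Sample University", start="2018-03-15"),
]
text = dump_affiliations(records)
round_tripped = load_affiliations(text)
```

Keeping dates as plain ISO 8601 strings is one possible design choice: it keeps the file readable by hand and avoids custom encoders, at the cost of leaving date validation to the consumer.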

Feb 03 '20 20:02 sbenthall

Is this the same as #352?

Mar 18 '20 20:03 npdoty

What can we pull from the IETF Datatracker? Or from RFCs which list affiliations of authors/editors in the credits? Or from some other data source?

And then drop that into a dataframe or some basic format, so that it can be consumed by other code working on analysis of organizational influence/distribution within working groups.
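A minimal sketch of that "basic format", assuming one flat row per observed (person, organization, working group) combination. The column names, the `organization_counts` helper, and the rows themselves are made up for illustration and do not reflect the actual Datatracker schema. A list of dicts like this can also be passed directly to `pandas.DataFrame`.

```python
from collections import Counter

# Made-up rows: one per (person, organization, working group) observation.
rows = [
    {"person": "Alice Example", "organization": "Example Corp", "working_group": "wg-a"},
    {"person": "Bob Sample", "organization": "Example Corp", "working_group": "wg-a"},
    {"person": "Carol Placeholder", "organization": "Sample University", "working_group": "wg-a"},
    {"person": "Bob Sample", "organization": "Example Corp", "working_group": "wg-b"},
]


def organization_counts(rows):
    """Count distinct people per (working group, organization) pair."""
    # Deduplicate first so a person seen twice in the same group counts once.
    seen = {(r["working_group"], r["organization"], r["person"]) for r in rows}
    return Counter((wg, org) for wg, org, _ in seen)


counts = organization_counts(rows)
```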

Jul 19 '21 14:07 npdoty

See #25 for discussion of this.

Datatracker attendance data is a nice source for affiliation and nationality data for IETF participants.

An actionable way to approach this might be:

  • Develop the affiliation data format and a corresponding object that tracks individuals along with their email addresses, affiliations, and nationalities, each with times/durations (where duration is inferred from time-point data).
  • Write a script that populates such an object from IETF attendance data, as an example of how to do it.
  • Demonstrate in a notebook how this can be used to plot affiliate interactions, as per @Christovis's visualizations from IAB-AID-1.
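The first two steps could be sketched roughly as follows. Everything here is an assumption made for illustration: the `AffiliationRecord` and `Span` classes, the `from_attendance_rows` helper, and the row keys (`name`, `date`, `affiliation`, `country`) are hypothetical and do not reflect the real attendance data layout. Durations are inferred by merging consecutive time-point observations that carry the same value.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Span:
    """A value together with the first and last date it was observed."""
    value: str
    first_seen: date
    last_seen: date


class AffiliationRecord:
    """Stores dated observations per person and attribute (hypothetical API)."""

    def __init__(self) -> None:
        # person -> attribute name -> list of (date, value) observations
        self._obs: Dict[str, Dict[str, List[Tuple[date, str]]]] = defaultdict(
            lambda: defaultdict(list)
        )

    def observe(self, person: str, attribute: str, value: str, when: date) -> None:
        self._obs[person][attribute].append((when, value))

    def spans(self, person: str, attribute: str) -> List[Span]:
        """Merge consecutive observations of the same value into spans."""
        spans: List[Span] = []
        for when, value in sorted(self._obs[person][attribute]):
            if spans and spans[-1].value == value:
                # Same value as the previous observation: extend the span.
                spans[-1] = Span(value, spans[-1].first_seen, when)
            else:
                spans.append(Span(value, when, when))
        return spans


def from_attendance_rows(rows) -> AffiliationRecord:
    """Populate a record from attendance-like rows (assumed keys, see above)."""
    record = AffiliationRecord()
    for row in rows:
        when = date.fromisoformat(row["date"])
        record.observe(row["name"], "affiliation", row["affiliation"], when)
        record.observe(row["name"], "nationality", row["country"], when)
    return record


# Made-up rows, for illustration only.
rows = [
    {"name": "Alice Example", "date": "2019-03-23", "affiliation": "Example Corp", "country": "US"},
    {"name": "Alice Example", "date": "2019-07-20", "affiliation": "Example Corp", "country": "US"},
    {"name": "Alice Example", "date": "2019-11-16", "affiliation": "Sample University", "country": "US"},
]
record = from_attendance_rows(rows)
affiliation_spans = record.spans("Alice Example", "affiliation")
```

Email addresses could be tracked the same way by calling `observe` with an `"email"` attribute. Note that a span only covers the dates actually observed; what happened between the last sighting of one affiliation and the first sighting of the next is left undetermined.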

Ultimately, such an object could wrap an ORM with an actual database behind it, and/or work with the entity resolution code to better resolve (its own) organization/affiliation references.

Dec 03 '21 15:12 sbenthall