acl-anthology
Backfill author metadata
We have been using aclpub2 for ingestion since last year. The YAML data they send us includes author metadata. We should ingest the author affiliations and then use this information for #2128 and other issues. This would be a simple one-time script.
#2283 handles the author affiliation ingestion from aclpub2 format, and all future ingestions from this format.
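For discussion, here is a rough sketch of what the core of the one-time backfill could look like. It is only a sketch under assumptions: the function name `backfill_affiliations` is hypothetical, the author records are assumed to be already-parsed YAML dicts with `last_name` and `institution` keys, and the XML is assumed to store affiliations as an `<affiliation>` child of `<author>`. All of these should be checked against the actual aclpub2 data and the Anthology schema before use.

```python
import xml.etree.ElementTree as ET


def backfill_affiliations(paper_elem, yaml_authors):
    """Add <affiliation> children to the <author> elements of one <paper>.

    paper_elem:   an ElementTree element for a single paper (assumed layout).
    yaml_authors: list of dicts parsed from the aclpub2 YAML (assumed keys:
                  "last_name", "institution").
    Returns the number of authors that received an affiliation.
    """
    xml_authors = paper_elem.findall("author")
    # If the two author lists do not line up, skip the paper rather than guess.
    if len(xml_authors) != len(yaml_authors):
        return 0

    updated = 0
    for author_elem, record in zip(xml_authors, yaml_authors):
        institution = (record.get("institution") or "").strip()
        # Nothing to add, or an affiliation is already present.
        if not institution or author_elem.find("affiliation") is not None:
            continue
        # Sanity check: only write when the last names agree.
        last = author_elem.findtext("last", default="").strip()
        if last != (record.get("last_name") or "").strip():
            continue
        ET.SubElement(author_elem, "affiliation").text = institution
        updated += 1
    return updated


# Made-up example data, not taken from any real volume.
paper = ET.fromstring(
    "<paper id='1'><title>Example</title>"
    "<author><first>Ada</first><last>Lovelace</last></author>"
    "<author><first>Alan</first><last>Turing</last></author></paper>"
)
authors = [
    {"first_name": "Ada", "last_name": "Lovelace", "institution": "Example University"},
    {"first_name": "Alan", "last_name": "Turing", "institution": None},
]
backfill_affiliations(paper, authors)
```

The function skips authors that already have an affiliation, so rerunning it on the same file should not create duplicates. Mapping each YAML paper entry to the right `<paper>` element is left out here and would need its own matching logic.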
Questions:
- Do we want to backfill all XML files with author affiliations by creating the aforementioned one-time script?
- How do we want to handle future ingestions that are not in aclpub2 format?
Wherever we have easy access to the information, we should include it.