
Work dumps include authors without a key

Open jimman2003 opened this issue 1 year ago • 3 comments

Works in the works dump contain author entries that have no key. These keyless entries are not present in the API response for the same work.

Evidence / Screenshot (if possible)

Authors of OL24527121W in the work dump: "authors": [{"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"author": {"key": "/authors/OL9239668A"}, "type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"author": {"key": "/authors/OL9239669A"}, "type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"author": {"key": "/authors/OL9239670A"}, "type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": 
"/type/author_role"}}, {"author": {"key": "/authors/OL9239671A"}, "type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": 
{"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"author": {"key": "/authors/OL9239672A"}, "type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"author": {"key": "/authors/OL9239673A"}, "type": {"key": "/type/author_role"}}, 
{"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}, {"type": {"key": "/type/author_role"}}]

API response:

{"title": "Medicina: Aspectos Epidemiol\u00f3gicos, Cl\u00ednicos e Estrat\u00e9gicos de Tratamento", "authors": [{"author": {"key": "/authors/OL9239668A"}, "type": {"key": "/type/author_role"}}, {"author": {"key": "/authors/OL9239669A"}, "type": {"key": "/type/author_role"}}, {"author": {"key": "/authors/OL9239670A"}, "type": {"key": "/type/author_role"}}, {"author": {"key": "/authors/OL9239671A"}, "type": {"key": "/type/author_role"}}, {"author": {"key": "/authors/OL9239672A"}, "type": {"key": "/type/author_role"}}, {"author": {"key": "/authors/OL9239673A"}, "type": {"key": "/type/author_role"}}], "key": "/works/OL24527121W", "type": {"key": "/type/work"}, "latest_revision": 1, "revision": 1, "created": {"type": "/type/datetime", "value": "2021-05-25T16:10:52.471943"}, "last_modified": {"type": "/type/datetime", "value": "2021-05-25T16:10:52.471943"}}
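To make the mismatch concrete, here is a minimal sketch of the comparison, assuming the only difference between the two payloads is the keyless `author_role` entries. The helper name `keyed_authors` and the abbreviated sample data are hypothetical and are not taken from the Open Library codebase.

```python
import json


def keyed_authors(authors):
    """Keep only the author-role entries that reference an author key."""
    return [
        entry
        for entry in authors
        if isinstance(entry, dict)
        and isinstance(entry.get("author"), dict)
        and entry["author"].get("key")
    ]


# Abbreviated, hypothetical sample shaped like the dump excerpt above.
dump_authors = json.loads("""[
  {"type": {"key": "/type/author_role"}},
  {"author": {"key": "/authors/OL9239668A"}, "type": {"key": "/type/author_role"}},
  {"type": {"key": "/type/author_role"}},
  {"author": {"key": "/authors/OL9239669A"}, "type": {"key": "/type/author_role"}}
]""")

# The matching (abbreviated) authors list as the API returns it.
api_authors = [
    {"author": {"key": "/authors/OL9239668A"}, "type": {"key": "/type/author_role"}},
    {"author": {"key": "/authors/OL9239669A"}, "type": {"key": "/type/author_role"}},
]

# Dropping the keyless entries should reproduce the API's list.
print(keyed_authors(dump_authors) == api_authors)  # prints True
```

If this holds for the full record, consumers of the dump could apply the same filter as a workaround until the dump itself is fixed.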

Relevant url?

Work API

Steps to Reproduce

  1. Read through the works dump
  2. Find the work (OL24527121W) in the dump
  3. Observe its authors array
  4. Compare it to the API response
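The steps above can be automated with a short script. This is a sketch, not the project's tooling: it assumes the works dump is a gzipped, tab-separated file whose five columns are type, key, revision, last_modified, and the JSON record. The function name `works_with_keyless_authors` and the file name in the usage comment are hypothetical.

```python
import gzip
import json


def works_with_keyless_authors(dump_path):
    """Yield (work_key, keyless_count, total_count) for each affected work."""
    with gzip.open(dump_path, "rt", encoding="utf-8") as dump:
        for line in dump:
            # Assumed columns: type, key, revision, last_modified, JSON record.
            columns = line.rstrip("\n").split("\t", 4)
            if len(columns) != 5 or columns[0] != "/type/work":
                continue
            record = json.loads(columns[4])
            authors = record.get("authors") or []
            # An entry is "keyless" when it has no "author" reference at all.
            keyless = sum(
                1
                for entry in authors
                if not (isinstance(entry, dict) and entry.get("author"))
            )
            if keyless:
                yield columns[1], keyless, len(authors)


# Example usage (the path is an assumption; point it at a local copy of the dump):
# for key, keyless, total in works_with_keyless_authors("ol_dump_works_latest.txt.gz"):
#     print(f"{key}: {keyless} of {total} author entries have no key")
```

The generator streams the file line by line, so it does not need to hold the whole dump in memory.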

Related files

script/oldump.sh script/oldump.py https://github.com/internetarchive/openlibrary/blob/ceca4ccca599ba4d2660d687937cdd85fc8b9a08/openlibrary/data/dump.py#L328

Stakeholders

@cdrini

jimman2003 • Feb 22 '24 17:02

Hmm 🤔 @RayBB would you by any chance be able to verify whether the latest dumps still have this issue?

cdrini • Sep 23 '24 15:09

Still there in the July dump, which is the latest I have handy.

tfmorris • Sep 23 '24 19:09

Ah ok, thank you @tfmorris. So it seems this is still an issue.

cdrini • Sep 23 '24 19:09