LibraryThing identifiers are for works, not editions
At present they are improperly attached to individual OL edition records. They should be moved, as they apply equally to all editions of a work. This could be used to help deduplicate and merge OL work records. The LT entries may also be helpful in identifying ISBNs that have been missed for a given work.
There are a number of places in the code where librarything ids are associated with editions:
https://github.com/internetarchive/openlibrary/search?utf8=%E2%9C%93&q=librarything&type=
most notably in openlibrary/plugins/openlibrary/pages/config_edition.page, where it is specified as an edition identifier.
Currently, 4.3M edition records have LibraryThing identifiers attached at the edition level:
grep -c '"librarything":' ol_dump_editions_2017-09-30.txt
4304416
I'll do some analysis to check how many are duplicate IDs.
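One way the duplicate-ID analysis could be approached, as a rough sketch only: count how many edition records carry each LibraryThing ID, then keep the IDs that appear more than once. This assumes each dump line is tab-separated with the JSON record in the last column, and that the IDs live under identifiers.librarything as a list (the key name matches the grep above). The function names are hypothetical and not part of the Open Library codebase.

```python
import json
from collections import Counter


def count_librarything_ids(lines):
    """Count how many edition records carry each LibraryThing id."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Assumption: tab-separated columns, JSON record in the last column.
        record = json.loads(line.split("\t")[-1])
        # Assumption: ids are stored as a list under identifiers.librarything.
        for lt_id in record.get("identifiers", {}).get("librarything", []):
            counts[lt_id] += 1
    return counts


def shared_ids(counts):
    """Return only the ids that appear on more than one edition record."""
    return {lt_id: n for lt_id, n in counts.items() if n > 1}
```

Against the real dump this could be run as `with open("ol_dump_editions_2017-09-30.txt") as f: print(len(shared_ids(count_librarything_ids(f))))`. IDs shared by several editions would be the expected case if they really are work-level identifiers.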
@mekarpeles Should this also move to the client?
This still appears to be a problem. @hornc, are you willing to be the assignee for this issue? Note that being the assignee doesn't necessarily mean you are responsible for doing the work, just for gathering and providing the information needed to address the issue. From the Wiki:
The assigned owner is not necessarily the person who will fix the issue (it is not necessarily even established, at that point, if or when the issue will be fixed at all), but rather they are the person who will do as much or as little as needed to handle the issue (asking questions, soliciting input, establishing and updating the priority, checking if it is a duplicate, etc).
Once an issue is labeled State: Work In Progress, the owner is the individual doing the work, or leading/coordinating the group that is doing the work.
I've added labels per context; let me know your thoughts.
Now that work identifiers are implemented, LibraryThing should be moved to works and existing IDs automatically migrated.
Correct, this is no longer blocked 🥳
I believe there are a few steps to this:
- Create a batch job to move any edition.identifiers.library_thing to the parent work at edition.identifiers.library_thing
- Add "Library Thing" to work identifiers config, and remove "Library Thing" from edition identifiers config
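The core of the batch job in the first step might look something like the sketch below. It works on plain dicts rather than the real Open Library client or database API, and it assumes the destination is the work's own identifiers map and that the stored key is librarything (as in the dump grep earlier in this thread). The function name is hypothetical.

```python
def move_librarything_ids(edition, work):
    """Move LibraryThing ids from an edition dict to its parent work dict.

    Returns True if anything was moved, so a caller knows which
    records would need saving.
    """
    edition_ids = edition.get("identifiers", {})
    # Assumption: the ids are stored as a list under the "librarything" key.
    lt_ids = edition_ids.pop("librarything", [])
    if not lt_ids:
        return False

    work_ids = work.setdefault("identifiers", {}).setdefault("librarything", [])
    for lt_id in lt_ids:
        # Several editions of one work may carry the same id; keep it once.
        if lt_id not in work_ids:
            work_ids.append(lt_id)

    # Drop the identifiers map if the edition has none left.
    if not edition_ids:
        edition.pop("identifiers", None)
    return True
```

A real job would also need to fetch each edition's parent work, handle editions with no work or with several, and save both records; none of that is shown here.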
I'm sorry if this isn't the right place to ask this, but could the work on this issue be why I'm seeing books with LT identifiers on the website while the Editions data dump does not include them?