pidgezero_one

36 comments by pidgezero_one

@Freso Regarding #9448, the implementation I have here is a fairly simple change to the existing import logic. Currently, the import pipeline decides if an author is a match by...

@Freso Sounds good! I'd be happy to hop on a call with you sometime in the near future as long as we can find a time between our time zones...

> There are significant risks to doing that because OpenLibrary contains (or did until it went offline) a large number of conflated author records with more being created at an...

> Also, before starting to use author identifiers, it would make sense to make sure as many matching, non-conflicting, identifiers as possible are imported from Wikidata. It contains a large...

Moved all import API code changes to https://github.com/internetarchive/openlibrary/pull/10092, which also consolidates Wikidata identifiers with the identifiers we already have stored in OL.

> @pidgezero-one I tried to use the Wikidata query here to figure out how many items in Wikidata have (archive.org ids OR openlibrary ids) AND Wikisource pages, but I can't...

> * it's not completely clear to me whether the `en:George_Bernard_Shaw` id is really a portable identifier or really a URL equivalent (wiki + page title, which can change),...

One more note to add! In addition to moving WS parser libraries into a standalone file to only be used during the import record generation process, I also moved the...

(I can't assign labels but this is likely a good first issue candidate)

I'll take this one on after our discussion earlier, @cdrini! Thank you!