Improve imports
I am attempting to create an epic to group together import issues.
The next step for this issue is to meet as a team, rank these issues in the right order, and break down the necessary issues onto the roadmap with details.
- Import Sources
- Better World Books
- [ ] #6555 (needs investigation + to be broken into sub-issues)
- Amazon
- [x] #6405
- [ ] #5333
- [ ] #7118
- [ ] #2674 (needs investigation)
- [x] #2444
- Archive.org
- [x] #7521
- Batch Import API
- [x] #7705
- [ ] #7236
- [x] #7160
- Import Data Quality & Resolution
- Highest priority
- [x] #7701
- [x] #7702 (based on #7701)
- [x] #7658 (should be resolved by #7521)
- [ ] #5833
- [x] #756
- [ ] #2205
- Moderate priority
- [x] #7509
- [x] #7349
- [ ] #3473
- [ ] #7113
- [ ] #2304 (unclear scope)
- [ ] #667 (unclear path forward)
- [ ] #2274 (needs investigation)
- [x] #2410 (still an issue?)
- [ ] #7484
- [ ] #7350
- [x] #7661
- [x] #7264
- Highest priority
- Cleaning Existing Data
- [x] #2569
- Other
- [ ] #7539 (a linked PR will close this)
- [ ] #2435
- [ ] #2625
- [x] #2208
- [x] #7744
- [ ] #7756 (needs priority assessment)
Stakeholders
This is a draft and needs more work / input. I will edit this to tag more people once I clean it up further.
Thank you for paying attention to this! Imported metadata quality has dropped sharply over the last 10-12 months, and it would be great to see this trend reversed.
This is really getting ridiculous. Perhaps it would be a good idea to pause non-MARC imports (i.e., Amazon and BWB) until things can be sorted out and the ship righted. See, for example, https://github.com/internetarchive/openlibrary/issues/6405#issuecomment-1517079017, though that is only one of a dizzying array of failure modes.