oldnyc
Missing OCR text for many images
A few examples:
- 702198b — text on brown backing paper
- 706410b — brown backing with text
- 709457b — grayscale with text but no OCR
- 729236b — grayscale with text but no OCR
- 711642b — Missing text from color image
- 711564b — Missing text from color image
- 716490b — Missing text from color image
- 731966b — Missing text from gray image (why?)
- 703429b — Missing text from color image
Based on my survey, ~20% of images have text on the back that was not OCR'd.
Of the nine examples in that list, eight were missing from the NYPL's S3 bucket. The ninth, 731966b, was actually the front of the image.
@riordan There were 30,413 back-of-card images in the S3 bucket, but ~43,000 photos in the CSV file that Matt originally sent me. Is there any chance we could recover more of the missing backs?
I'll try.
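In case it helps narrow down what to look for, here is a rough sketch of how the gap could be listed by diffing the IDs in the CSV against a listing of the bucket's keys. Everything specific in it is an assumption: the `photo_id` column name, the `backs/` prefix, the `.jpg` extension, and the idea that a back image is named with the ID plus a trailing `b` (inferred from IDs like 702198b above). The sample IDs are made up, and the function names are hypothetical.

```python
import csv
import io
import posixpath


def photo_ids_from_csv(csv_text, id_column="photo_id"):
    """Collect photo IDs from CSV text. The column name is an assumption."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[id_column].strip() for row in reader if row.get(id_column)}


def back_ids_from_keys(keys, suffix="b"):
    """Collect IDs that have a back image among the bucket keys.

    Assumes a key such as 'backs/100001b.jpg' is the back of photo 100001.
    """
    ids = set()
    for key in keys:
        stem = posixpath.splitext(posixpath.basename(key))[0]
        if stem.endswith(suffix) and len(stem) > len(suffix):
            ids.add(stem[: -len(suffix)])
    return ids


def missing_backs(csv_ids, bucket_back_ids):
    """Return the IDs listed in the CSV that have no back image in the bucket."""
    return sorted(csv_ids - bucket_back_ids)


# Made-up sample data, only to show the shape of the inputs.
sample_csv = "photo_id,title\n100001,First\n100002,Second\n100003,Third\n"
sample_keys = ["backs/100002b.jpg", "fronts/100002f.jpg", "fronts/100003f.jpg"]

missing = missing_backs(
    photo_ids_from_csv(sample_csv),
    back_ids_from_keys(sample_keys),
)
print(missing)
```

The key listing could come from any source (a saved text file of key names would do), so the sketch deliberately avoids calling S3 directly. Note that this only finds backs that are absent from the bucket; it says nothing about backs that are present but were never OCR'd.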