fromthepage
add pdfs and zip file info to upload documentation
Currently we allow users to upload zip files of images, PDFs, or zip files of PDFs. This is not covered in sufficient detail in our documentation. In particular, the documentation does not explain the various options for importing text (text layers of PDFs, matching txt files, matching xml files).
Existing documentation: https://content.fromthepage.com/project-owner-documentation/
Existing video walking through upload options: https://www.youtube.com/watch?v=UcNXSY0q9uE
Page Image Guidelines
PNG, GIF, and JPG files are all acceptable.
Images should be oriented so that they are right-side-up.
Images should be named so that an alphabetical sort will result in the correct page order. (This may require "zero-padding" for any page numbers: page_09.jpg, page_10.jpg will sort correctly, but page_9.jpg, page_10.jpg will not.)
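The zero-padding point might be easier to follow with a small illustration. The sketch below is not part of FromThePage; `zero_pad` is a hypothetical helper, and the default width of 3 is an arbitrary choice. It only shows why unpadded names sort out of order and how padding the digits fixes that.

```python
import os
import re


def zero_pad(filename, width=3):
    """Pad every run of digits in the file stem to `width` characters.

    Hypothetical helper for illustration; the extension is left untouched.
    """
    stem, ext = os.path.splitext(filename)
    padded = re.sub(r"\d+", lambda m: m.group().zfill(width), stem)
    return padded + ext


names = ["page_9.jpg", "page_10.jpg", "page_1.jpg"]

# A plain alphabetical sort puts page_10 before page_9.
print(sorted(names))

# After padding, the alphabetical sort matches the page order.
print(sorted(zero_pad(n) for n in names))
```

Renaming the files this way before zipping them should be enough for the upload to keep pages in order, assuming the page number is the only number in the file name.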
Rudimentary documentation covering importing txt: https://content.fromthepage.com/project-owner-documentation/uploading-existing-transcriptions/ (Does not mention XML or PDF text)
FTP_Uploading_Transcriptions.docx
Updated text for Uploading existing transcriptions or OCR with Page Images. I chose to go for a two-column table to illustrate image/transcript pairing.
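To make the image/transcript pairing concrete, here is a rough sketch of how the two columns of such a table could be derived from a list of file names. It assumes that "matching" means the txt or xml file shares the image's base name; that is my reading and not something confirmed in this issue. `pair_images_with_transcripts` is a made-up name, not a FromThePage function.

```python
from pathlib import PurePath

# Image types listed in the Page Image Guidelines; text types from this issue.
IMAGE_EXTS = {".png", ".gif", ".jpg"}
TEXT_EXTS = {".txt", ".xml"}


def pair_images_with_transcripts(filenames):
    """Return (image, transcript-or-None) pairs, ordered by image base name.

    Assumption: a transcript matches an image when both share the same stem.
    """
    images = {}
    texts = {}
    for name in filenames:
        path = PurePath(name)
        ext = path.suffix.lower()
        if ext in IMAGE_EXTS:
            images[path.stem] = name
        elif ext in TEXT_EXTS:
            texts[path.stem] = name
    return [(images[stem], texts.get(stem)) for stem in sorted(images)]


files = ["page_02.jpg", "page_01.txt", "page_01.jpg"]
for image, transcript in pair_images_with_transcripts(files):
    print(image, "|", transcript or "(no transcript)")
```

An image without a matching text file simply gets an empty second column, which is the case the table in the draft should probably show as well.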
Example for CDM import: https://collections.digitalmaryland.org/digital/collection/mcw/id/355/rec/47
IA Example: https://archive.org/details/Jeremiah_White_Graves_Diary_Volume_1_Part_4
Current Draft (3 Docs)
https://docs.google.com/document/d/1KDh7na1XLoVIFaeP2EOY8Pld4sWePKjF8XhNMhn--U8/edit#heading=h.q1nhze60n5a9
Image Files: https://docs.google.com/document/d/1KDh7na1XLoVIFaeP2EOY8Pld4sWePKjF8XhNMhn--U8/edit?usp=sharing
PDFs: https://docs.google.com/document/d/1iE3xziMIXfMOAnQXBotGnPa45Dpbc9S2RHC4yAwmtEs/edit?usp=sharing
@masonacjones if you could review and fix the comments on these documents, I think we can publish them on our documentation site, which would be great.