
Extract text directly from file-object / file-content rather than using filename

Open jrkkfst opened this issue 4 years ago • 1 comment

Maybe this is already possible?

How would I go about extracting text from the content of a file, rather than having textract read the file from disk itself? The background is that the upload component in Dash gives you the content of the file rather than a pointer to the file's location.

Perhaps this is already possible by accessing one of the internal functions in textract and specifying an extension?

jrkkfst, Dec 03 '20 13:12

Unfortunately, there's no way to do that, because textract sometimes launches external commands (most notably pdftotext) to process files.

traverseda, Dec 29 '20 20:12
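
Since textract works from a filename, one possible workaround (not something proposed in this thread) is to write the uploaded content to a temporary file and pass that path to `textract.process`. The sketch below assumes the Dash upload component delivers its content as a `data:<mime>;base64,<payload>` string. The helper names `decode_dash_contents` and `extract_text_from_bytes` are hypothetical, and the `extractor` parameter exists only so the helper can be exercised without textract installed.

```python
import base64
import os
import tempfile


def decode_dash_contents(contents):
    """Decode a 'data:<mime>;base64,<payload>' string into raw bytes.

    Assumption: this is the shape of the string the Dash upload
    component hands to the callback.
    """
    _header, _, payload = contents.partition(",")
    return base64.b64decode(payload)


def extract_text_from_bytes(data, extension, extractor=None):
    """Write `data` to a temporary file and run a path-based extractor on it.

    `extension` (e.g. "pdf" or ".pdf") becomes the temp file's suffix,
    because textract chooses its parser from the file extension.
    """
    if extractor is None:
        # Imported lazily so the helper can be tested without textract.
        import textract
        extractor = textract.process

    suffix = extension if extension.startswith(".") else "." + extension
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        # Close the file before extracting so external commands can open it.
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        return extractor(path)
    finally:
        # Remove the temporary file whether or not extraction succeeded.
        os.remove(path)
```

In a Dash callback this might be called as `extract_text_from_bytes(decode_dash_contents(contents), os.path.splitext(filename)[1])`, where `contents` and `filename` come from the upload component. With the default extractor the result is whatever `textract.process` returns for that path.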