
Pdf converter showing Error

Open TobiKoledoye opened this issue 5 years ago • 2 comments

I get the following whenever I try to use the PDF converter: "Unexpected error: <class 'AttributeError'> Unable to process file 1q19-pr-12648.pdf"

I tried it using the examples and got the same thing.

TobiKoledoye avatar Jan 09 '20 02:01 TobiKoledoye

Hi @TobiKoledoye

The pdf converter tutorial currently works, you can try it here.

Can you share the code you used and your pdf file so we can reproduce the bug?

fmikaelian avatar Jan 16 '20 21:01 fmikaelian

@fmikaelian I am facing the same issue. I have attached the PDF file, and the code I am running in my command prompt is shown below.

Even after it converts into a data frame, the result is not in a proper format. Can you help me with this?

JD1.pdf


>>> from cdqa.utils.converters import pdf_converter
>>> import tika
>>> df = pdf_converter(directory_path='/home/xxxx/Downloads/data/')

suresh96458 avatar Feb 19 '20 11:02 suresh96458
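
A possible way to narrow this down, offered as a rough sketch and not a confirmed diagnosis: the session above imports `tika`, so it may help to check what Tika itself extracts from each PDF before `pdf_converter` runs. Whether an empty extraction is what triggers the `AttributeError` is an assumption, not something established in this thread. The helper names `summarize_parse` and `check_directory` are hypothetical. The only library call used is `tika.parser.from_file`, which returns a dict whose `content` value can be `None` when no text is found, for example with scanned, image-only PDFs.

```python
from pathlib import Path


def summarize_parse(parsed):
    """Describe a Tika-style parse result (a dict with a 'content' key)."""
    content = (parsed or {}).get("content")
    if content is None:
        return "no text extracted"
    if not content.strip():
        return "only whitespace extracted"
    return f"{len(content.strip())} characters extracted"


def check_directory(directory_path):
    """Parse every PDF in directory_path with Tika and report what came back."""
    # Imported here so summarize_parse can be used without Tika installed.
    # Tika needs a Java runtime and starts a local server on first use.
    from tika import parser

    report = {}
    for pdf_path in sorted(Path(directory_path).glob("*.pdf")):
        parsed = parser.from_file(str(pdf_path))
        report[pdf_path.name] = summarize_parse(parsed)
    return report
```

Calling `check_directory` on the same folder passed to `pdf_converter` shows which files yield no text. If a file reports "no text extracted", the problem probably lies in the PDF itself or in the Tika setup, not in cdQA.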