
Unable to generate preview

Open xaxanod opened this issue 3 years ago • 10 comments

Hi!

I'm currently trying Pandora on an Ubuntu 22.04 VM. Everything works fine, except when I analyze a .docx or .xlsx file: the preview gives me this error:

Unable to generate preview: Unsupported URL file:///opt/pandora/tasks/2022/07/27d58688-d7a2-4856-91e4-9c7bfc909381/filename.docx: "type detection failed" ./framework/source/loadenv/loadenv.cxx:189

Tell me if you need any further information, thanks for your help.

xaxanod avatar Jul 01 '22 08:07 xaxanod

LibreOffice is a bit touchy and sometimes fails to convert. Which LibreOffice packages do you use?

It sometimes works better if you use the PPA: https://www.ubuntuupdates.org/ppa/libreoffice

Rafiot avatar Jul 01 '22 09:07 Rafiot

Thanks, it's OK now with the new PPA.

xaxanod avatar Jul 04 '22 14:07 xaxanod

Alright, I'll update the doc to recommend using that.

Rafiot avatar Jul 04 '22 14:07 Rafiot

Hello, I deployed the latest version of Pandora in Docker. I was trying to scan a text file (.txt) that contains some extracts of the "Lorem Ipsum" text, and I got this error in the docker-compose logs:

pandora    | 2022-09-06 14:34:02,110 unoserver INFO:Starting unoconverter.
pandora    | 2022-09-06 14:34:02,845 unoserver INFO:Opening /pandora/tasks/2022/09/e31b451d-9c8b-45e2-86a3-cb5c50e83317/eicar.com.txt
pandora    | 2022-09-06 14:34:06,717 unoserver INFO:Exporting to /pandora/tasks/2022/09/e31b451d-9c8b-45e2-86a3-cb5c50e83317/eicar.com.txt.pdf
pandora    | 2022-09-06 14:34:06,718 unoserver INFO:Using writer_pdf_Export export filter
pandora    | 2022-09-06 14:34:06,781 preview ERROR:SfxBaseModel::impl_store <file:///pandora/tasks/2022/09/e31b451d-9c8b-45e2-86a3-cb5c50e83317/eicar.com.txt.pdf> failed: 0xc10(Error Area:Io Class:Write Code:16) at ./sfx2/source/doc/sfxbasemodel.cxx:3207 at ./sfx2/source/doc/sfxbasemodel.cxx:1783

pandora    | Traceback (most recent call last):
pandora    |   File "/pandora/pandora/workers/preview.py", line 13, in analyse
pandora    |     task.file.convert()
pandora    |   File "/pandora/pandora/file.py", line 265, in convert
pandora    |     office_to_pdf(self.path, f'{self.path}.pdf')
pandora    |   File "/pandora/pandora/file.py", line 66, in office_to_pdf
pandora    |     converter.convert(source, outpath=dest)
pandora    |   File "/root/.cache/pypoetry/virtualenvs/pandora-DsNW_kSp-py3.10/lib/python3.10/site-packages/unoserver/converter.py", line 201, in convert
pandora    |     document.storeToURL(export_path, output_props)
pandora    | unoserver.converter.com.sun.star.io.IOException: SfxBaseModel::impl_store <file:///pandora/tasks/2022/09/e31b451d-9c8b-45e2-86a3-cb5c50e83317/eicar.com.txt.pdf> failed: 0xc10(Error Area:Io Class:Write Code:16) at ./sfx2/source/doc/sfxbasemodel.cxx:3207 at ./sfx2/source/doc/sfxbasemodel.cxx:1783
pandora    | 2022-09-06 14:34:06,782 preview WARNING:Unable to generate preview, this is suspicious: SfxBaseModel::impl_store <file:///pandora/tasks/2022/09/e31b451d-9c8b-45e2-86a3-cb5c50e83317/eicar.com.txt.pdf> failed: 0xc10(Error Area:Io Class:Write Code:16) at ./sfx2/source/doc/sfxbasemodel.cxx:3207 at ./sfx2/source/doc/sfxbasemodel.cxx:1783

Do you have any idea what the problem is? I did not encounter this error with the previous version...

Have a nice day!

bigboi314 avatar Sep 09 '22 07:09 bigboi314

That's LibreOffice barfing. Are you using the PPA? If yes, can you try restarting Pandora and/or restarting the server (it might be something with the cache)?

Rafiot avatar Sep 09 '22 08:09 Rafiot

Yes, I installed the PPA and restarted Pandora, but I got the same error... Do you think it could come from somewhere else?

bigboi314 avatar Sep 09 '22 08:09 bigboi314

The error is SfxBaseModel::impl_store <file:///pandora/tasks/2022/09/e31b451d-9c8b-45e2-86a3-cb5c50e83317/eicar.com.txt.pdf> failed: 0xc10(Error Area:Io Class:Write Code:16) at ./sfx2/source/doc/sfxbasemodel.cxx:3207 at ./sfx2/source/doc/sfxbasemodel.cxx:1783

so it is definitely something in LibreOffice that barfs, but I have no idea how to debug that. Do you have the issue with other files, or just this one?

Rafiot avatar Sep 09 '22 08:09 Rafiot
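For triaging logs like the one quoted above, the LibreOffice error string can be split into its parts. The following is a minimal sketch, not part of Pandora or unoserver: the helper name `parse_lo_error` and the regular expression are assumptions based only on the shape of this one message, and other LibreOffice errors may be formatted differently.

```python
import re
from typing import Optional

# Assumed pattern, derived only from the message quoted in this thread.
_ERROR_RE = re.compile(
    r"<(?P<url>[^>]+)> failed: (?P<code>0x[0-9a-fA-F]+)"
    r"\(Error Area:(?P<area>\w+) Class:(?P<cls>\w+) Code:(?P<num>\d+)\)"
)


def parse_lo_error(message: str) -> Optional[dict]:
    """Hypothetical helper: extract URL, error code, area, class and
    source locations from a LibreOffice 'impl_store ... failed' message.
    Returns None when the message does not match the assumed shape."""
    match = _ERROR_RE.search(message)
    if match is None:
        return None
    fields = match.groupdict()
    # Collect every "at <file>:<line>" suffix reported by LibreOffice.
    fields["locations"] = re.findall(r" at (\S+:\d+)", message)
    return fields
```

In the message from this thread, the fields read `Area:Io` and `Class:Write`, which at least suggests the failure happens while writing the exported PDF rather than while opening the source file.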

I am trying with other files, but the error seems random: for this file (.txt) the scan fails, but for another file (.txt) it succeeds... Maybe I am missing something in my configuration... I will try to install Pandora without Docker. I have another question: can you explain to me how the "workers" work?

bigboi314 avatar Sep 09 '22 15:09 bigboi314

yeah, LibreOffice failing randomly sounds like a thing it does, I don't think it's a config issue.

You'll need to give me more details on what you want to know regarding the workers. A good starting point is to go look at the existing workers: https://github.com/pandora-analysis/pandora/tree/main/pandora/workers

Rafiot avatar Sep 09 '22 15:09 Rafiot
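If the conversion really does fail intermittently, one possible mitigation is to retry it a few times before giving up. The sketch below is an illustration only and not how Pandora handles it: `convert_with_retry` is a hypothetical helper, and it assumes the conversion is exposed as a callable taking a source path and a destination path (as `office_to_pdf` appears to in the traceback above).

```python
import time
from typing import Callable


def convert_with_retry(
    convert: Callable[[str, str], None],
    source: str,
    dest: str,
    attempts: int = 3,
    delay: float = 1.0,
) -> int:
    """Hypothetical helper: call convert(source, dest), retrying on failure.

    Returns the number of attempts used; re-raises the last exception
    if every attempt fails."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            convert(source, dest)
            return attempt
        except Exception:
            if attempt == attempts:
                raise
            # Linear backoff: wait a little longer after each failure.
            time.sleep(delay * attempt)
    raise AssertionError("unreachable")
```

Retrying only hides the symptom; if the failures are caused by resource exhaustion, the retries may fail the same way until the load drops.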

Hello,

Since last time, I managed to deploy Pandora and, in fact, I no longer had preview problems. I also noticed that these problems are linked to my machine's resources: I tested in a VM with 2 CPUs and 2 GB of RAM, and when the CPUs are overloaded, the preview cannot be generated, so Pandora raises an "Alert" about it, which gave me a "false" indication. Once I doubled the resources, it worked better.

With Docker, I had to modify your Dockerfile and docker-compose file to reproduce the steps indicated in the README. Here too, I did not have problems with the preview, but again, if my machine does not have enough resources, Pandora generates the "Alert" about the preview.

So I think the problem was mainly the performance of my machine: if the CPUs are overloaded, Pandora can't generate the preview. Does that sound plausible to you?

As for the workers, would you prefer to keep discussing them here, in another issue, or somewhere else?

bigboi314 avatar Sep 14 '22 12:09 bigboi314
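To check whether a failed preview coincides with CPU pressure, the system load can be compared against the number of CPUs around the time of the conversion. This is a rough sketch under stated assumptions: the function names and the threshold of 1.0 load per CPU are illustrative choices, not Pandora settings, and `os.getloadavg()` is only available on Unix-like systems.

```python
import os


def cpu_overloaded(load_1min: float, cpu_count: int, threshold: float = 1.0) -> bool:
    """Return True when the 1-minute load average per CPU exceeds threshold.

    The threshold is an assumed value for illustration."""
    return load_1min / max(cpu_count, 1) > threshold


def current_overload(threshold: float = 1.0) -> bool:
    """Apply cpu_overloaded() to the live system (Unix only)."""
    load_1min, _, _ = os.getloadavg()
    return cpu_overloaded(load_1min, os.cpu_count() or 1, threshold)
```

Logging this value next to the preview warning would show whether the "Alert" tends to appear only when the VM is saturated, as described above for the 2-CPU setup.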