Indexing PDF content
Hi, I have a problem with indexing PDF files. It seems that the MIME type is not recognized, because the content of the PDF file is not extracted. It just stores the raw file content, like '%PDF-1.4 %�쏢 5 0 obj <> stream x��}K���nxf|��/� ....'
I get the same results with xls and doc files.
Could you help me, please? Thank you.
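A quick way to confirm that raw bytes, rather than extracted text, ended up in the index is to check the stored value for the PDF file signature. The following is only a minimal sketch: it assumes the stored field value has already been fetched as a string, and the function name `looks_like_raw_pdf` is hypothetical, not part of river-web or any plug-in.

```python
def looks_like_raw_pdf(stored_content: str) -> bool:
    """Return True if the indexed value still begins with the PDF file signature."""
    # A PDF file begins with the marker "%PDF-" followed by a version number,
    # so finding it in the index suggests the bytes were stored without
    # any text extraction step.
    return stored_content.lstrip().startswith("%PDF-")


# Hypothetical sample values, for illustration only.
print(looks_like_raw_pdf("%PDF-1.4 5 0 obj <> stream"))
print(looks_like_raw_pdf("Plain extracted text"))
```

A similar check would not work for xls or doc files, which use a different binary signature.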
Is the file available on the internet? I'd like to reproduce the problem.
Hi. Yes, the file is publicly accessible on the internet: http://www.csas.cz/static_internet/cs/Komunikace/Interni_komunikace/Informacni_kniha/Prilohy/TOP_Business_sdeleni_klientum.pdf But I think the problem is not in the file. Did I understand correctly that river-web indexes the content of PDF files directly, or should I use the attachment plug-in?