open-semantic-search

Stop unzipping of Documents and removing files by file-type

Open dennirockz opened this issue 3 years ago • 1 comments

Dear all, I am having some issues with the indexing of files. I've blacklisted application/vnd.oasis.opendocument. but my Word and Excel files are still extracted, and the embedded documents are linked in my search engine. Is there a way to a) stop the extraction of Word, PPT, etc. and b) remove all files by type (e.g. remove all *.emf files)?

Thanks for your help!


dennirockz avatar Jun 02 '21 09:06 dennirockz

Hi, for that you can add the file extensions you want to skip to this file: /etc/opensemanticsearch/blacklist/blacklist-url-suffix (see also: https://www.opensemanticsearch.org/doc/admin/config/blacklist)

mmoossen avatar Oct 31 '21 16:10 mmoossen
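
For anyone who wants to script the suggestion above, here is a minimal Python sketch that appends suffixes to the blacklist file without creating duplicates. It assumes the file holds one suffix per line and that an entry such as .emf (with the leading dot) is the expected form; neither is confirmed in this thread, so check the linked documentation first. The function name add_blacklisted_suffixes is hypothetical and not part of Open Semantic Search.

```python
from pathlib import Path


def add_blacklisted_suffixes(blacklist_file, suffixes):
    """Append each suffix to the blacklist file unless it is already listed.

    Assumption: the file holds one suffix per line (not confirmed in the thread).
    Returns the list of suffixes that were actually added.
    """
    path = Path(blacklist_file)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    existing = {line.strip() for line in text.splitlines()}

    added = []
    for suffix in suffixes:
        suffix = suffix.strip()
        if suffix and suffix not in existing:
            added.append(suffix)
            existing.add(suffix)

    if added:
        # Start on a fresh line if the file does not end with a newline.
        prefix = "" if not text or text.endswith("\n") else "\n"
        with path.open("a", encoding="utf-8") as handle:
            handle.write(prefix + "\n".join(added) + "\n")
    return added
```

It could be called as add_blacklisted_suffixes("/etc/opensemanticsearch/blacklist/blacklist-url-suffix", [".emf"]), most likely with root permissions. Note that this only covers skipping files by suffix, as described in the reply; it does not remove documents that are already in the index.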