open-semantic-search
Stop unzipping of Documents and removing files by file-type
Dear all, I am having some issues with the indexing of files. I've blacklisted application/vnd.oasis.opendocument. but my Word and Excel files are still extracted, and the embedded documents show up as linked results in my search engine. Is there a way to a) stop the extraction of Word, PowerPoint, etc. files, and b) remove all files of a given type (e.g. all *.emf files)?
Thanks for your help!
Hi, to skip those files, add their file extensions to /etc/opensemanticsearch/blacklist/blacklist-url-suffix (see also: https://www.opensemanticsearch.org/doc/admin/config/blacklist).
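As a rough illustration of that suggestion, here is a small Python sketch that appends suffixes to the blacklist file without duplicating entries. It assumes the file holds one suffix per line and that entries are written with a leading dot (e.g. `.emf`); neither is confirmed in this thread, so check the linked documentation first. The helper name `add_suffixes` is hypothetical and not part of Open Semantic Search.

```python
from pathlib import Path

# Path taken from the answer above; writing to it normally requires root.
BLACKLIST_FILE = Path("/etc/opensemanticsearch/blacklist/blacklist-url-suffix")


def add_suffixes(blacklist_path, suffixes):
    """Append each suffix to the blacklist file unless it is already listed.

    Assumption: the file holds one suffix per line.
    Returns the list of suffixes that were actually added.
    """
    path = Path(blacklist_path)
    existing_text = path.read_text(encoding="utf-8") if path.exists() else ""
    existing = {line.strip() for line in existing_text.splitlines()}

    added = []
    for suffix in suffixes:
        suffix = suffix.strip()
        if suffix and suffix not in existing:
            added.append(suffix)
            existing.add(suffix)

    if added:
        with path.open("a", encoding="utf-8") as handle:
            # Make sure new entries start on their own line.
            if existing_text and not existing_text.endswith("\n"):
                handle.write("\n")
            for suffix in added:
                handle.write(suffix + "\n")
    return added
```

Calling `add_suffixes(BLACKLIST_FILE, [".emf"])` with sufficient privileges would add the EMF suffix from the question. This only affects what gets skipped from then on; it does not touch documents that are already in the index.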