Image files and external files are indexed although both are set to not be indexed.

Open ferdiga opened this issue 1 year ago • 4 comments

Indexing of images in particular is extremely time-consuming, and images must be excluded.

  • Up to ONE minute per image at 100% CPU - hence basically unusable for databases with a lot of images. Possibly only the EXIF information should be extracted and indexed.

BTW, there is currently no option to configure the following, among others:

"files_image": "0",
"files_audio": "0",
- Content Providers:
Deck 1.13.1
[]
Files 29.0.1
{
    "files_local": "1",
    "files_external": "2",
    "files_group_folders": "1",
    "files_encrypted": "0",
    "files_federated": "0",
    "files_size": "1",
    "files_pdf": "1",
    "files_office": "1",
    "files_image": "0",
    "files_audio": "0",
    "files_chunk_size": "2",
    "files_fulltextsearch_tesseract": {
        "version": "27.0.0",
        "enabled": "1",
        "psm": "4",
        "lang": "eng,deu,fra",
        "pdf": "1",
        "pdf_limit": "0"
    }
}
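
To make the pasted settings easier to compare against the issue title, here is a minimal Python sketch that lists which keys in the report are set to "0". It assumes that "0" means "disabled", which is how the report reads it; the helper name `disabled_keys` is hypothetical and is not part of the app. The nested Tesseract block is omitted for brevity.

```python
import json

# Flat provider settings copied from the report above
# (the nested "files_fulltextsearch_tesseract" block is left out).
FILES_CONFIG = """
{
    "files_local": "1",
    "files_external": "2",
    "files_group_folders": "1",
    "files_encrypted": "0",
    "files_federated": "0",
    "files_size": "1",
    "files_pdf": "1",
    "files_office": "1",
    "files_image": "0",
    "files_audio": "0",
    "files_chunk_size": "2"
}
"""


def disabled_keys(config: dict) -> list[str]:
    """Return the keys whose value is the string "0".

    Assumption: "0" means the category is switched off. This only
    inspects the pasted values; it says nothing about how the app
    itself interprets them.
    """
    return sorted(key for key, value in config.items() if value == "0")


config = json.loads(FILES_CONFIG)
print(disabled_keys(config))
```

With the values above, `files_image` and `files_audio` are in the list, but `files_external` is not: it is set to "2" rather than "0". What "2" means is not stated in this thread, so it may be worth checking separately from the image problem.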

ferdiga avatar Sep 13 '24 11:09 ferdiga

Just to illustrate: the indexing finished after 2 1/2 months with an error.

  • not very helpful error message - have to contact Hetzner support
  • during this time no other occ command is allowed

Image

ferdiga avatar Nov 08 '24 06:11 ferdiga

Do you scan images?

Piefje01 avatar Nov 08 '24 06:11 Piefje01

Apparently yes, although the settings should exclude images, as shown above: "files_image": "0", "files_audio": "0".

IMO EXIF data could be scanned at low cost.

ferdiga avatar Nov 08 '24 06:11 ferdiga

Somehow it did not work: OCR within a JPG, for example. I have a lot of photos with sample numbers, but they will not be scanned.

Piefje01 avatar Nov 08 '24 07:11 Piefje01