Hard Error/Force Quit on files with parentheses - ( )

Open codejp3 opened this issue 1 year ago • 4 comments

Ran into an issue with the FTS initial index on my files.

I had a directory of files whose names all ended with "(YEAR)", i.e. an opening parenthesis, a 4-digit year, and a closing parenthesis (e.g. "Training Manual (2001).pdf"). Whenever the initial index reached one of those files, it appeared to freeze. Even after I let it run for more than a day on a single file, it never progressed any further; it eventually stopped responding and then force quit itself.

After I renamed the files to remove the parentheses (e.g. "Training Manual 2001.pdf"), they indexed fine.
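For anyone hitting the same thing, the rename workaround can be scripted. Below is a minimal sketch in Python, not part of fulltextsearch or Nextcloud: `strip_parens` and `rename_without_parens` are hypothetical helper names, and the script assumes it is enough to drop the `(` and `)` characters from each filename. It defaults to a dry run and skips any file whose new name would collide with an existing one.

```python
from pathlib import Path


def strip_parens(name: str) -> str:
    """Drop '(' and ')' from a filename, leaving everything else untouched."""
    return name.replace("(", "").replace(")", "")


def rename_without_parens(root: Path, dry_run: bool = True) -> list[tuple[Path, Path]]:
    """Plan (and optionally perform) renames for files under `root`.

    Returns the list of (old_path, new_path) pairs. Nothing on disk is
    changed unless dry_run is False.
    """
    plan = []
    # sorted() materializes the listing first, so renaming files
    # does not interfere with the directory walk.
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        new_name = strip_parens(path.name)
        if new_name == path.name:
            continue  # no parentheses in this filename
        target = path.with_name(new_name)
        if target.exists():
            continue  # assumption: never overwrite an existing file
        plan.append((path, target))
        if not dry_run:
            path.rename(target)
    return plan


# Example (hypothetical path): preview first, then apply.
# for old, new in rename_without_parens(Path("/path/to/files")):
#     print(old, "->", new)
# rename_without_parens(Path("/path/to/files"), dry_run=False)
```

This only works around the problem; it does not explain why the indexer hangs on these names. If the files are renamed directly on disk rather than through Nextcloud, a rescan (e.g. `occ files:scan`) is probably needed before re-running the index.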

Before renaming, no error was written to the index log. Nothing indicated that the parentheses were the cause; I simply guessed that was the issue, and guessed right.

Before renaming:

┌─ Indexing  ────
│ Action: indexDocument
│ Provider: Files                Account: USERNAME
│ Document: 3067515
│ Info: application/pdf
│ Title: Path/To/Training Manual (2001).pdf
│ Content size: 7852544
│ Chunk:   1101/1277
│ Progress:    964/6511
└──
┌─ Results ────
│ Result:    539/539
│ Index: files:3067542
│ Status: ok
│ Message: {"_index":"nc_indexnextcloud","_id":"files:3067542","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no"
│ :41204,"_primary_term":2}
│ 
└──
┌─ Errors ────
│ Error:      0/0
│ Index: 
│ Exception: 
│ Message: 
│ 
│ 
└──

## x:first result ## c/v:prec/next result ## b:last result
## f:first error ## h/j:prec/next error ## d:delete error ## l:last error
## q:quit ## p:pause 
Force Quit

Note the "Force Quit" above

occ fulltextsearch:document:status -u USERNAME files 3067515

In DocumentStatus.php line 199:

  Specify a valid status: IGNORE, INDEX, DONE, REMOVE, FAILED

occ fulltextsearch:document:provider -- USERNAME files 3067515

(did not respond and produced no output)

After renaming (and indexing successfully):

occ fulltextsearch:document:status -u USERNAME -j -- files 3067515

{
    "ownerId": "USERNAME",
    "providerId": "files",
    "collection": "local",
    "source": "files_local",
    "documentId": "3067515",
    "lastIndex": 1713212033,
    "errors": [],
    "errorCount": 0,
    "status": 1,
    "options": {
        "_files_pdf": "1",
        "_files_local": "1"
    }
}

{
    "id": "3067515",
    "providerId": "files",
    "access": {
        "ownerId": "USERNAME",
        "viewerId": "",
        "users": [],
        "groups": [],
        "circles": [],
        "links": []
    },
    "modifiedTime": 1485997542,
    "title": "Path/To/Training Manual 2001.pdf",
    "link": "http:\/\/localhost\/index.php\/f\/3067515",
    "index": {
        "ownerId": "USERNAME",
        "providerId": "files",
        "collection": "",
        "source": "files_local",
        "documentId": "3067515",
        "lastIndex": 0,
        "errors": [],
        "errorCount": 0,
        "status": 28,
        "options": {
            "_files_pdf": "1",
            "_files_local": "1"
        }
    },
    "source": "files_local",
    "info": {
        "share_names": {
            "USERNAME": "Path\/To\/Training Manual 2001.pdf"
        }
    },
    "hash": "",
    "contentSize": 7852544,
    "tags": [],
    "metatags": [
        "files_local"
    ],
    "subtags": [],
    "more": {
        "creationTime": 1713211957,
        "accessedTime": 1713210132
    },
    "excerpts": [],
    "score": ""
}

codejp3 · Apr 16 '24 01:04