sist2

Scan task once halted never continues

Open MarioSob opened this issue 4 months ago • 0 comments

Device Information (please complete the following information):

  • OS: Linux x86 64bit, Ubuntu Server, Docker
  • Deployment: Docker compose
  • SIST2 Version: 3.4.1
  • Elasticsearch Version (if relevant) : 7.17.18

Describe the bug

A scan never resumes after it has been interrupted by closing the program. Instead of continuing with the files that the previous scan did not manage to complete, the next scan only processes newly detected files. Clicking on index does not help. Clicking full reindex helps, although it rescans all files from scratch.

Steps To Reproduce

  1. Run sist2 in Docker, configure the ES7 backend in sist2-admin, add a job with OCR of images and ebooks, add a schedule, and create a frontend.
  2. Start indexing.
  3. In the admin, open the tasks window to watch the scan. CPU usage is high and the logs indicate that OCR is running.
  4. After a few files have been scanned, with many still remaining, shut down the Docker container gracefully.
  5. Start the application (container) again.
  6. In the admin, the task is not running; it appears to be completed.
  7. The next automatic indexing run does not pick up the files that were not scanned before. The frontend never lets you search the content of those files.

Expected behavior

The next scan task should detect the files that were not scanned and run OCR on them. It should resume from the first file that the last scan task did not process.

Actual Behavior

The next scan task ignores the actual scan status of the files.
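To make the expectation concrete, here is a minimal sketch of the selection logic I would expect from a resumed scan. This is not sist2's actual implementation; `FileEntry`, `files_to_scan`, and the `completed` mapping are hypothetical names used only for illustration. The assumption is that the scanner records which files were fully processed, so an interrupted run leaves the rest marked as pending.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileEntry:
    """Hypothetical record for a file found while walking the scan root."""
    path: str
    mtime: int  # last-modified time, seconds since epoch


def files_to_scan(on_disk, completed):
    """Return the entries that still need scanning.

    on_disk:   iterable of FileEntry found on disk
    completed: dict mapping path -> mtime for files whose scan fully finished
    """
    pending = []
    for entry in on_disk:
        done_mtime = completed.get(entry.path)
        # Pending if never completed, or modified since it was last completed.
        if done_mtime is None or done_mtime < entry.mtime:
            pending.append(entry)
    return pending


# Only a.jpg finished before the container was shut down.
on_disk = [FileEntry("a.jpg", 100), FileEntry("b.pdf", 100), FileEntry("c.pdf", 100)]
completed = {"a.jpg": 100}
print([e.path for e in files_to_scan(on_disk, completed)])  # ['b.pdf', 'c.pdf']
```

What I observe instead looks as if every file seen by the interrupted scan were treated as completed, so only files that appear later get picked up.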

Additional context

The files were a mixture of JPEGs and PDFs (with text and sometimes also, or only, images inside). There were around 100 files in total across the directory and its subdirectories.

MarioSob, Feb 29 '24 21:02