fulltextsearch:live seems to miss a lot of documents in a CIFS-mounted directory
NC 17.0.2
Full text search - Elasticsearch Platform 1.4.0
Full text search - Files 1.3.6
Notes:
- The files in question are on an SMB mount set up with the External Storages app
- Files are OCR'd before ingest and index
- Background jobs are set to cron
Typically, I have a tmux session that I use to run and watch FTS as cron has been unreliable for me. I use the command:
sudo -u www-data php /var/www/nextcloud/occ fulltextsearch:index && sudo -u www-data php /var/www/nextcloud/occ fulltextsearch:live
I notice that while FTS is running in :live mode, many documents are not properly indexed. I have also noticed several times that my SMB mount seems to lose its connection: the share shows a grey or pink color in Files on the web interface, indicating that the connection to SMB is down, and this clears up as soon as I visit the External Storages page in the admin settings. I believe it's possible that while the connection is down, FTS misses the new documents, and when the SMB share reconnects, FTS is unaware that anything was missed. ES only returns results for them once fulltextsearch:index has been run manually.
Perhaps this is really an External Storages bug, but it affects FTS's ability to reliably index documents added to a CIFS-mounted directory.
I haven't dug into the code, but I'm guessing the cron job does the equivalent of :live? Perhaps a periodic :index run should be added to cron until External Storages handles SMB connections more reliably?
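
As a stopgap, the periodic :index idea could be sketched as a crontab entry for the www-data user (edited with `sudo crontab -u www-data -e`). This is only a sketch under assumptions: the 30-minute interval and the lock file path are made up for illustration, it assumes `flock` from util-linux is installed, and I have not checked whether a scheduled :index run can coexist with a running :live session.

```shell
# Hypothetical crontab entry for www-data; the interval and lock path are assumptions.
# flock -n skips this run if the previous :index pass still holds the lock,
# so slow passes over the SMB share do not pile up.
*/30 * * * * flock -n /tmp/fts-index.lock php /var/www/nextcloud/occ fulltextsearch:index
```

If this works, documents missed while the share was disconnected would be picked up at the next scheduled pass instead of waiting for a manual run.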