fulltextsearch
Fulltextsearch doesn't add new documents from external cifs/samba storage
Hi, maybe this is just a configuration issue. I've been using Nextcloud/ownCloud for a long time, at least since version 6. I'm now on Nextcloud 13.0.1 and recently (on version 12) switched from Nextant to Elasticsearch.
Indexing of local files stored in Nextcloud itself works like a charm. But when it comes to files on externally mounted SMB shares, new files are not recognized as new and thus are not indexed.
Just a few remarks about my installation.
OS: Ubuntu 16.04 (i7-3773 @ 3.4 GHz, 32 GB RAM)
DB: Postgres 10
Webserver: Nginx
Elasticsearch 6.2.3 in Docker container with indices on SSD
Nextcloud cron selected for background job processing (not Ajax or webcron)
crontab
*/15 * * * * www-data php -f /var/www/nextcloud/cron.php
*/17 * * * * www-data php -f /var/www/nextcloud/occ files:scan --all --quiet
systemd daemon for live indexing
[Unit]
Description=Elasticsearch Worker for Nextcloud Fulltext Search
After=network.target
[Service]
User=www-data
Group=www-data
WorkingDirectory=/var/www/nextcloud
ExecStart=/usr/bin/php /var/www/nextcloud/occ fulltextsearch:live
Nice=19
Restart=always
[Install]
WantedBy=multi-user.target
All fulltext apps are on the most recent version, 0.61, except Full text search - Files, which is on 0.60 because no newer version is available.
Any help would be greatly appreciated.
Many thanks in advance.
Hey @lhurt, it looks like a known problem: https://github.com/nextcloud/fulltextsearch/issues/250
Thanks for the information. We'll see after the update.
Hi, I think there is a new, unexplained term, "workingdirectory". I assume that this is where the Nextcloud code resides, not the data, which is "datadirectory".
Please keep me updated after the release of fulltextsearch 0.7 (within the next few days).
Works for me now with 0.72.
Sorry. I tested only by adding files to the external storage in Nextcloud via the web interface. This works. If I add files to a Samba share from a Windows client, they are not indexed.
This might be an issue with the sync, and/or the event for a new file is not dispatched; therefore, fulltextsearch is not aware that a new file has been uploaded.
How can this be solved? I'd suppose this is a very common use case.
If I add files to a samba share on a windows client it's not indexed.
@lhurt did you tell your cifs client that it should use version 2?
@Sanookmakmak I suppose you mean the smb protocol version? I set it to smb2 but the issue remains.
@Sanookmakmak Now I even went to smb3 and still the issue remains.
Upgraded to 13.0.4 and unfortunately still no improvement
Upgraded to 14.0.1 and fulltextsearch 0.99.2/3/4 and still no solution
Current steps to reproduce:
1. Store any text file in a directory located on the Samba share
2. Delete the index
3. Rebuild the index
4. Start live indexing
5. Copy the file from step 1 into the same directory under a different name
6. Wait 1 day
7. Search for text contained in the text file from step 1
Result: only one hit, with the filename of the file from step 1.
Expected:
2 hits: the files from step 1 and step 5.
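The scriptable part of these steps could be sketched roughly as follows. This is only a sketch under assumptions: it presumes that `fulltextsearch:reset` and `fulltextsearch:index` are the occ commands for deleting and rebuilding the index in the installed version (the reset command may ask for confirmation), the unit name `nextcloud-fulltext-live.service` is hypothetical because the thread does not name the unit file, and `run`, `occ` and `reproduce` are helper names made up for this example.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run occ as the web server user. The path is taken from the unit file
# quoted earlier; override OCC_DIR if your installation differs.
occ() {
    run sudo -u www-data php "${OCC_DIR:-/var/www/nextcloud}/occ" "$@"
}

# reproduce SHARE_DIR ORIGINAL COPY
#   SHARE_DIR  directory on the Samba share that already holds ORIGINAL (step 1)
#   ORIGINAL   name of the existing text file
#   COPY       name for the copy created in step 5
reproduce() {
    share_dir=$1
    original=$2
    copy=$3

    # Step 2: delete the index (assumed command name).
    occ fulltextsearch:reset || return 1
    # Step 3: rebuild the index (assumed command name).
    occ fulltextsearch:index || return 1
    # Step 4: start live indexing (hypothetical unit name).
    run systemctl start "${LIVE_UNIT:-nextcloud-fulltext-live.service}" || return 1
    # Step 5: copy the file on the share itself, bypassing Nextcloud.
    run cp "$share_dir/$original" "$share_dir/$copy" || return 1
}
```

Calling `DRY_RUN=1 reproduce /path/to/share original.txt copy.txt` only prints the four commands. Steps 6 and 7 (waiting and searching) remain manual.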
Very disappointing.
Unintentionally closed. The issue still persists even with version 1.0
With version 1.2.3 the issue seems to be resolved and everything is working as expected so far.
Have to reopen it.
With Nextcloud 15.0.6, full text search 1.2.5, and full text search - files 1.2.6, files are not added to the index when they are created directly on the file system, e.g. when the folder is mounted as a Windows drive.
Did a complete reinstall/reindex to verify that it's not corrupted leftover data.
and yes, the entry
d /var/run/samba 2775 root www-data - -
is also present in /usr/lib/tmpfiles.d/samba.conf of the host system
I don't understand why it doesn't work.
This should be fixed on NC16, please keep me updated if you have a chance to test it !
Thanks very much for the fast reply. Since NC16 is scheduled for release on April 25th, I'll wait for it and reply then.
After testing with NC16 I have to say that the issue wasn't fixed, unfortunately.
To reproduce, I used the steps described above with Windows Explorer as the Samba client.
The interactive live index screen didn't show any activity until I copied the file in Nextcloud's browser interface. Then the display was as expected.
Memory: 15 MB
┌─ Indexing ────
│ Action: waiting
│ Provider: Files Account: Ludwig
│ Document: 26109787
│ Info: text/plain
│ Title: _Dokumente/_ignoriere_mich/TopBankingError (kopieren).txt
│ Content size: 3988
└──
┌─ Results ────
│ Result: 1/1
│ Index: files:26109787
│ Status: ok
│ Message: {"_index":"my_nextcloud","_type":"standard","_id":"files:26109787",
│ "_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":
│ 0},"_seq_no":97697,"_primary_term":1}
└──
┌─ Errors ────
│ Error: 1/1
│ Index: files:16598
│ Exception: Elasticsearch\Common\Exceptions\ServerErrorResponseException
│ Message: java.lang.IllegalArgumentException: java.lang.IllegalArgumentExcept
│ ion: field [content] not present as part of path [attachment.content]
│
└──
## x:first result ## c/v:prec/next result ## b:last result
## f:first error ## h/j:prec/next error ## d:delete error ## l:last error
## q:quit ## p:pause
Looks like the live indexer doesn't watch inotify. Inotify is running and accessible within the docker container. I tested this with another container.
In addition to that I ran the external file notify command before copying a file in Windows explorer.
docker exec --user www-data nextcloud_php_fpm php occ files_external:notify -vvu user -p password 17
Self-test successful
added /_ignoriere_mich/TopBankingError - Kopie.txt
modified /_ignoriere_mich/TopBankingError - Kopie.txt
So Nextcloud is notified about what's going on. The live index monitor stays silent:
Memory: 12 MB
┌─ Indexing ────
│ Action: waiting
│ Provider: Account:
│ Document:
│ Info:
│ Title:
│ Content size:
└──
┌─ Results ────
│ Result: 0/0
│ Index:
│ Status:
│ Message:
│
│
└──
┌─ Errors ────
│ Error: 1/1
│ Index: files:16598
│ Exception: Elasticsearch\Common\Exceptions\ServerErrorResponseException
│ Message: java.lang.IllegalArgumentException: java.lang.IllegalArgumentExcept
│ ion: field [content] not present as part of path [attachment.content]
│
└──
## x:first result ## c/v:prec/next result ## b:last result
## f:first error ## h/j:prec/next error ## d:delete error ## l:last error
## q:quit ## p:pause
So the notification seems to be there; it just doesn't trigger the indexer.
Any idea what's going wrong?
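To compare what the notifier reports with what later shows up in the live index monitor, the verbose notify output can be reduced to a list of changed paths. A minimal sketch, assuming the output format shown above (lines starting with `added ` or `modified `); `notify_paths` is a made-up helper name, and any other line types are simply ignored.

```shell
# Read verbose `files_external:notify` output on stdin and print every path
# reported as added or modified, each path only once, in order of appearance.
notify_paths() {
    sed -n -e 's/^added //p' -e 's/^modified //p' | awk '!seen[$0]++'
}
```

Piping the notify command's output through `notify_paths` gives one line per changed file, which can then be checked one by one against the search results.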
New update 16.0.1 didn't improve the situation. Same behavior.
@icewind1991 would you mind having a look?
Are you using Ubuntu 16.04 with SMB protocol v2 or v3? Maybe this is related to icewind1991/SMB/issues/56
Of course I use SMB > 1 as it is deprecated and Windows will only connect with workarounds that I don't want to apply.
Here's my relevant part of smb.conf
---- snip ----
client min protocol = SMB2
client max protocol = SMB3
---- snip ----
Nevertheless, I just noticed that there may be an issue because I'm using Docker. My installation is based on the fpm image, which is itself based on Debian stretch, and there the smbclient version is 4.5.16, which is far behind the current 4.10.
I'll try changing the base image to fpm-alpine, which should have a 4.10 smbclient; maybe this solves it. As soon as I have results, I'll post them.
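A quick way to check which smbclient a given image ships might look like the sketch below. It assumes that `smbclient --version` prints a line of the form `Version 4.5.16-Debian` and that `sort -V` (version sort, as in GNU coreutils) is available; `smb_version` and `version_ge` are made-up helper names.

```shell
# Extract the dotted version number from `smbclient --version` output on
# stdin, e.g. "Version 4.5.16-Debian" -> "4.5.16".
smb_version() {
    sed -n 's/^Version \([0-9][0-9.]*\).*/\1/p'
}

# Succeed if version $1 is greater than or equal to version $2.
# Uses version sort: the smaller of the two versions sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example (container name taken from the docker exec command earlier in the thread):
#   v=$(docker exec nextcloud_php_fpm smbclient --version | smb_version)
#   version_ge "$v" 4.10.0 || echo "smbclient $v is older than 4.10"
```

A plain string comparison would get this wrong, since "4.5.16" sorts after "4.10.0" lexically; that is why the sketch relies on version sort.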
But the notify problem only affects you if you use occ files_external:notify.
As far as I can see, you are using cron with files:scan, and you don't use notify as described in the External Storage SMB/CIFS documentation.
This issue has existed for over a year now, and it seems like I'm the only one having this problem. Is this really the case? Is it such an extraordinary use case?
Still doesn't work with fulltext 1.3.6 and fulltext_files 1.3.5
Does anyone else have this working?
I can confirm this problem with nextcloud 16.0.3 and fulltext 1.3.6 and fulltext_files 1.3.5.
If I create a new file directly on the share this file is not indexed. occ files_external:notify is running and live index is running too.
I tested something one time and created a new file. The file doesn't seem to be updated via notify, but after about 24h the file was included in the fulltextsearch.
Same problem with NC16. I'm going to test NC17 and will post the results here.