open-semantic-search
Problem with some Filetypes and celery/flower
Hello!
When I run "index dir", the CPU sometimes goes to 100% and extraction gets stuck. After some testing I have seen that the "problem files" being extracted are of type ".STEP" / ".SLDASM" / ".SLDDRW", but it does not happen every time.
I then have to go to the Flower UI, terminate all running tasks (8 tasks) and restart the service "opensemanticetl"; after that, extraction continues.
Is there an explanation for why this problem happens?
By the way, is it possible to keep file types like "STEP" in the index but prevent them from being extracted? We do not need their contents extracted.
Many thanks!
After some more tests I think it's definitely a problem with some "3D-Files".
- Index a directory with 1600 files and no 3D files such as STEP etc. --> perfect
- Index a directory with 600 files and many STEP files --> nightmare: I have to terminate most of the tasks from Flower.
I don't know why, but I think excluding these files from extraction would help. I just don't know how to set that up.
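To see up front how many of these files a directory contains, the listing can be split by extension before indexing. This is only a rough sketch: the function name split_by_extension and the extension set are my own choices (the three extensions are the ones named above), and it does not touch any Open Semantic Search configuration.

```python
from pathlib import Path

# Extensions reported as problematic in this thread; extend as needed.
CAD_EXTENSIONS = {".step", ".sldasm", ".slddrw"}


def split_by_extension(root, extensions=CAD_EXTENSIONS):
    """Return (cad_files, other_files) for all regular files below root.

    Hypothetical helper for illustration; the comparison is
    case-insensitive, so ".STEP" and ".step" are treated alike.
    """
    cad_files, other_files = [], []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        if path.suffix.lower() in extensions:
            cad_files.append(path)
        else:
            other_files.append(path)
    return cad_files, other_files
```

Calling split_by_extension("/path/to/dir") returns two lists, so the non-3D files can be indexed on their own to check whether the hang goes away.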
Thanks!
I also just had the indexing service get stuck. It turned out I had accidentally broken my config file.
First make sure your config file is all right. You can test this by indexing a specific file:
opensemanticsearch-index-file /path/file.pdf
If you get a Python error, that may point to your problem.
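The same single-file test can be repeated over a list of files with a timeout, to find out which ones hang. A minimal sketch, assuming opensemanticsearch-index-file does its work in the foreground and exits non-zero on a Python error (I have not verified either); find_stuck_files and the 300 second default are made up for illustration.

```python
import subprocess


def find_stuck_files(paths, command="opensemanticsearch-index-file", timeout_s=300):
    """Index files one at a time and return (timed_out, failed).

    Hypothetical helper: timed_out holds the paths whose run exceeded
    timeout_s seconds, failed the paths whose run exited non-zero.
    """
    timed_out, failed = [], []
    for path in paths:
        try:
            # subprocess.run kills the child process when the timeout expires.
            result = subprocess.run(
                [command, str(path)],
                timeout=timeout_s,
                capture_output=True,
                text=True,
            )
        except subprocess.TimeoutExpired:
            timed_out.append(path)
            continue
        if result.returncode != 0:
            failed.append(path)
    return timed_out, failed
```

Note that the timeout only kills the process started here. If that process hands work to other workers, those may keep running.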
The other, independent part of the machinery is the opensemanticsearch services. You have:
- Opensemanticsearch ETL
- Opensemanticsearch ETL-Filemonitoring
(On Debian) you can check their health by executing:
systemctl --full --type service --all
Start, stop or get the status with, for example:
systemctl status opensemanticetl
systemctl stop opensemanticetl-filemonitoring
Use the status command to check what is going on with those two services.
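If you want to check both services in one go, something like the following could work. It is a sketch, not part of Open Semantic Search: service_states is a made-up name, the service names are the two from the commands above, and it relies on "systemctl is-active", which prints a state such as active, inactive or failed.

```python
import subprocess

# Service names as used in the systemctl commands above.
SERVICES = ["opensemanticetl", "opensemanticetl-filemonitoring"]


def service_states(services=SERVICES, run=subprocess.run):
    """Return {service_name: state} using 'systemctl is-active'.

    Hypothetical helper; 'run' can be swapped out for testing.
    An empty answer from systemctl is reported as "unknown".
    """
    states = {}
    for name in services:
        result = run(
            ["systemctl", "is-active", name],
            capture_output=True,
            text=True,
        )
        states[name] = result.stdout.strip() or "unknown"
    return states
```

Printing service_states() after a stuck indexing run shows quickly whether one of the two services has stopped or failed.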