
Missing dependencies, apt-get -f install, removes open-semantic-search

Open pdurusau opened this issue 3 years ago • 21 comments

The complete log follows. In short: dpkg --install open-semantic-search_21.01.03.deb reports missing dependencies, and apt-get -f install then removes open-semantic-search instead of installing them ??

OS: Debian 10 (buster)

Log as follows:

210116 15:14 /home/patrick/Downloads root@daat# dpkg --install open-semantic-search_21.01.03.deb
Selecting previously unselected package open-semantic-search.
(Reading database ... 310270 files and directories currently installed.)
Preparing to unpack open-semantic-search_21.01.03.deb ...
Unpacking open-semantic-search (20.01.17) ...
dpkg: dependency problems prevent configuration of open-semantic-search:
 open-semantic-search depends on libapache2-mod-php (>= 0); however:
  Package libapache2-mod-php is not installed.
  Version of libapache2-mod-php on system, provided by libapache2-mod-php7.3:amd64, is .
 open-semantic-search depends on php-xml (>= 0); however:
  Package php-xml is not installed.
 open-semantic-search depends on php-bcmath (>= 0); however:
  Package php-bcmath is not installed.
 open-semantic-search depends on libapache2-mod-wsgi-py3 (>= 0); however:
  Package libapache2-mod-wsgi-py3 is not installed.
 open-semantic-search depends on python3-rdflib (>= 0); however:
  Package python3-rdflib is not installed.
 open-semantic-search depends on python3-pysolr (>= 0); however:
  Package python3-pysolr is not installed.
 open-semantic-search depends on python3-dateutil (>= 0); however:
  Package python3-dateutil is not installed.
 open-semantic-search depends on pst-utils (>= 0); however:
  Package pst-utils is not installed.
 open-semantic-search depends on python3-celery (>= 0); however:
  Package python3-celery is not installed.
 open-semantic-search depends on python3-nltk (>= 0); however:
  Package python3-nltk is not installed.
 open-semantic-search depends on rabbitmq-server (>= 0); however:
  Package rabbitmq-server is not installed.
 open-semantic-search depends on tesseract-ocr-all (>= 0); however:
  Package tesseract-ocr-all is not installed.

dpkg: error processing package open-semantic-search (--install):
 dependency problems - leaving unconfigured
Processing triggers for systemd (241-7~deb10u5) ...
Errors were encountered while processing:
 open-semantic-search
210116 15:16 /home/patrick/Downloads root@daat# apt-get -f install
Reading package lists... Done
Building dependency tree
Reading state information... Done
Correcting dependencies... Done
The following packages will be REMOVED:
  open-semantic-search
0 upgraded, 0 newly installed, 1 to remove and 0 not upgraded.
1 not fully installed or removed.
After this operation, 102 kB disk space will be freed.
Do you want to continue? [Y/n] y
(Reading database ... 312697 files and directories currently installed.)
Removing open-semantic-search (20.01.17) ...
Processing triggers for systemd (241-7~deb10u5) ...
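
The dpkg output above names every unmet dependency, and apt-get -f install typically falls back to removing a package when it cannot find those dependencies in the configured sources. A minimal sketch for pulling the package names out of a saved log so they can be checked one by one, assuming a POSIX shell with grep, awk and sort; missing_deps is a hypothetical helper name, not part of open-semantic-search:

```shell
#!/bin/sh
# Hypothetical helper (not part of open-semantic-search): read dpkg output on
# stdin and print each package reported as "not installed", one per line,
# sorted and without duplicates.
missing_deps() {
  grep -o 'Package [^ ]* is not installed' | awk '{ print $2 }' | sort -u
}
```

Usage, assuming the dpkg output was saved to a file called install.log (an assumed name): missing_deps < install.log. Each name can then be looked up with apt-cache policy to see whether apt knows an install candidate for it.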

pdurusau avatar Jan 16 '21 20:01 pdurusau

I built the entire system by hand, all components, about a year ago, as part of my learning curve. There were interlocking dependencies like this at the time. The only solution was to build the piece you wanted, after modifying its dependencies.

Based on my experience with OSS, I do not think Markus would have the time to untangle this. It's been a while since I touched this, but I recall keeping a directory of all the debs, which could be successfully installed as a group.

NetwarSystem avatar Jan 17 '21 04:01 NetwarSystem

It worked for me recently on ubuntu 20.04 but there were some issues after it was mostly up and running. Sorry I can't provide more info at present but I'm likely to try again in a few days

jazzdup avatar Jan 18 '21 16:01 jazzdup

Looking further, sudo apt install ./open... shows:


open-semantic-search :

Depends: python3-celery (>= 0) but it is not installable (outside of an env, system sees Python 2.7.16)
Depends: rabbitmq-server (>= 0) but it is not installable (depends on Erlang, which is not installed)

I thought I would be clever and create a virtual Python3 environment and install Erlang. Same errors as before. I'll take the list and install all the dependencies and then try the install script.

pdurusau avatar Jan 19 '21 20:01 pdurusau

Both are packages which are available in Debian buster so the dependencies should be automatically resolved:

https://packages.debian.org/buster/python3-celery
https://packages.debian.org/buster/rabbitmq-server

What is in your config /etc/apt/sources.list ?

Mandalka avatar Jan 19 '21 20:01 Mandalka

deb cdrom:[Debian GNU/Linux 10.4.0 Buster - Official amd64 NETINST 20200509-10:25]/ buster main

#deb cdrom:[Debian GNU/Linux 10.4.0 Buster - Official amd64 NETINST 20200509-10:25]/ buster main

deb http://deb.debian.org/debian/ buster main contrib non-free
deb-src http://deb.debian.org/debian/ buster main contrib non-free

deb http://security.debian.org/debian-security buster/updates main contrib non-free
deb-src http://security.debian.org/debian-security buster/updates main contrib non-free

# buster-updates, previously known as 'volatile'
deb http://deb.debian.org/debian/ buster-updates main contrib non-free
deb-src http://deb.debian.org/debian/ buster-updates main contrib non-free

# This system was installed using small removable media
# (e.g. netinst, live or single CD). The matching "deb cdrom"
# entries were disabled at the end of the installation process.
# For information about how to configure apt package sources,
# see the sources.list(5) manual.
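
A quick way to see which of these entries apt actually reads is to print only the uncommented deb and deb-src lines. A minimal sketch, assuming a POSIX shell with grep; active_sources is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print the active (uncommented) deb/deb-src entries of a
# sources.list-style file. Lines that start with '#' are skipped.
active_sources() {
  grep -E '^[[:space:]]*deb(-src)?[[:space:]]' "$1"
}
```

Usage: active_sources /etc/apt/sources.list. If an active cdrom entry shows up, or the network mirrors are missing, that would be one possible reason for packages not being found; after apt-get update, apt-cache policy python3-celery rabbitmq-server shows whether apt sees an install candidate.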

pdurusau avatar Jan 19 '21 21:01 pdurusau

OK, I can hand-install every dependency except python3-celery and rabbitmq-server (not tonight, I'm too tired). Why those two packages, said to be standard on the distribution sites, are missing, I have no idea. I will pursue it in the AM when I'm rested. By "missing" I mean: when I run update in aptitude and the package name is displayed, the two rightmost columns are none and none.

pdurusau avatar Jan 20 '21 02:01 pdurusau

To continue this install saga: I can confirm that the typical Debian mirror does have rabbitmq-server and python3-celery (see https://deb.debian.org/debian/indices/files/typical.files, 39 MB). However, despite this repository being listed in /etc/apt, these packages don't appear in aptitude. I'm investigating further. I want to hand-install all the required packages and then test the opensemanticsearch install. Sorry for the noise.

pdurusau avatar Jan 20 '21 15:01 pdurusau

Update: After hand installing the dependencies, I still got an error message on libapache2-mod-php not being installed. I had verified that it was installed. The error message reads: "Version of libapache2-mod-php on system, provided by libapache2-mod-php7.3:amd64 is ." ??? Does it not like that version?
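
One possible reading of that message, offered as an assumption rather than a diagnosis: dpkg generally treats a versioned dependency such as libapache2-mod-php (>= 0) as satisfied only by a real package of that name or by a versioned Provides. If only libapache2-mod-php7.3 is installed and it provides the name without a version, the version shows up empty in the message and the dependency stays unmet. A minimal sketch for checking whether the real package is installed, assuming dpkg-query is available; pkg_installed is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the named package itself is fully
# installed according to dpkg. A package that merely "provides" the name
# does not count here.
pkg_installed() {
  dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q 'install ok installed'
}
```

Usage: pkg_installed libapache2-mod-php || echo "real package missing". If it is missing, installing libapache2-mod-php itself (not only the 7.3 variant) would be the thing to try.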

Anyway, I ran apt-get -f install and got an enormous install log with some failures, warnings about paths, etc., which I tried to read as it streamed by. I'm running in an Emacs shell, so I will capture all the output and work through it from there.

pdurusau avatar Jan 20 '21 22:01 pdurusau

@pdurusau I can confirm I just built this OK on Debian buster on Azure. It was just a case of:

apt-get update
dpkg --install open-semantic-search_21.01.03.deb
apt-get -f install
apt-get -f install

(I had to run the fix twice because it needs the JVM installed first.) Annoyingly, the install downloads a spaCy model for every language!? But it's up and running after about half an hour. The interface for Flower is not available, which I also found when trying on Ubuntu 18 locally before. I haven't checked much else, but it seems to be indexing OK.
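
The "run the fix twice" step can be scripted as a bounded retry. A minimal sketch, assuming a POSIX shell; retry is a hypothetical helper name, and the command in the usage note is the one quoted above:

```shell
#!/bin/sh
# Hypothetical helper: run a command up to N times, stopping at the first
# success. Returns 0 on success, 1 if every attempt failed.
retry() {
  attempts=$1
  shift
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
  done
  return 1
}
```

Usage, as root: retry 2 apt-get -f install. The bound keeps a genuinely broken install from looping forever.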

@Mandalka I tried emailing you from [email protected]. opensemanticsearch looks brilliant, and I would like to explore contributing or collaborating, if possible?

jazzdup avatar Jan 21 '21 18:01 jazzdup

Thanks, that's encouraging! I suspect it may require a clean install.

One example:

"Restarting apache2 (via systemctl): apache2.service
Job for apache2.service failed because the control process exited with error code.
See "systemctl status apache2.service" and "journalctl -xe" for details.
failed!"

systemctl status apache2.service shows:


apache2.service - The Apache HTTP Server
   Loaded: loaded (/lib/systemd/system/apache2.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Wed 2021-01-20 16:58:54 EST; 22h ago
     Docs: https://httpd.apache.org/docs/2.4/
  Process: 31504 ExecStart=/usr/sbin/apachectl start (code=exited, status=1/FAILURE)

Jan 20 16:58:54 daat systemd[1]: Starting The Apache HTTP Server...
Jan 20 16:58:54 daat apachectl[31504]: Action 'start' failed.
Jan 20 16:58:54 daat apachectl[31504]: The Apache error log may have more information.
Jan 20 16:58:54 daat systemd[1]: apache2.service: Control process exited, code=exited, status=1/FAILURE
Jan 20 16:58:54 daat systemd[1]: apache2.service: Failed with result 'exit-code'.
Jan 20 16:58:54 daat systemd[1]: Failed to start The Apache HTTP Server.


The log in /var/log/apache2/error.log.1 reveals:

[Wed Jan 20 16:58:54.494766 2021] [wsgi:crit] [pid 31507] mod_wsgi (pid=31507): The mod_python module can not be used in conjunction with mod_wsgi 4.0+. Remove the mod_python module from the Apache configuration.

That may explain why connections to Solr weren't possible.
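
The error message itself points at the fix: Apache has both mod_python and mod_wsgi enabled. A minimal sketch for spotting the clash, assuming the usual Debian layout in which enabled modules appear in /etc/apache2/mods-enabled as python.load and wsgi.load; has_wsgi_python_clash is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: succeed if both mod_python and mod_wsgi are enabled in
# the given mods-enabled directory (default: the Debian Apache layout).
has_wsgi_python_clash() {
  dir=${1:-/etc/apache2/mods-enabled}
  [ -e "$dir/python.load" ] && [ -e "$dir/wsgi.load" ]
}
```

If it reports a clash, a2dismod python followed by systemctl restart apache2 (both as root) is the usual way to disable a module on Debian. Whether anything else on the system still needs mod_python is worth checking first.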

There were other errors, such as:

The scripts geoff, neokit and py2neo are installed in '/usr/local/bin' which is not on PATH.

But it isn't clear which user is missing that path. It's certainly in my path. Will have to look further.
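
Since PATH can differ per user and per service, it helps to test the value a given user actually gets. A minimal sketch, assuming a POSIX shell; on_path is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: succeed if directory $1 is an entry of the
# colon-separated list $2 (defaults to the current PATH).
on_path() {
  case ":${2:-$PATH}:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Usage: on_path /usr/local/bin || echo "not on PATH". Running the same check from a root shell, or from whichever account ran the installer, would show which user the warning refers to.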

The other thing I may try is to install it locally in my own directory to isolate it from the main system artifacts. There it would natively have Python 3, etc.

pdurusau avatar Jan 21 '21 21:01 pdurusau

From my very minimal experience with this I'd highly recommend building it in the cloud instead of locally if that's what you're doing. Or use the virtualbox version. I'd like to build it locally as well at some point and contribute to dev if that's being welcomed at all...

jazzdup avatar Jan 22 '21 11:01 jazzdup

Has anyone here found a way to install a fresh open-semantic-search .deb in an Ubuntu or Debian container? I'm stuck at the postinst script from Tika, /opt/postinst.tika-server.deb:

#!/bin/sh

adduser --system --disabled-password tika
groupadd -f tesseract_cache
usermod -a -G tesseract_cache tika

# rights for OCR cache
chown tika:tesseract_cache /var/cache/tesseract
chmod 770 /var/cache/tesseract

# load our additional service config
systemctl daemon-reload

# start while booting
systemctl enable tika

# (re)start after installation
systemctl restart tika

Any ideas on how to replace the systemctl instructions?
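
One option is to make the script skip the systemctl calls when no systemd is running, which is the normal situation inside a plain Docker container. A minimal sketch, assuming that the presence of /run/systemd/system indicates a booted systemd (the check documented for sd_booted); has_systemd and maybe_systemctl are hypothetical names, and this is not the project's own postinst:

```shell
#!/bin/sh
# Hypothetical helpers for a container-friendly postinst (not the project's
# own script).

# Succeed if systemd is the running init. The directory to test can be
# overridden for testing; the default is the path sd_booted(3) checks.
has_systemd() {
  [ -d "${1:-/run/systemd/system}" ]
}

# Run systemctl only when systemd is available; otherwise report and carry on.
maybe_systemctl() {
  if has_systemd; then
    systemctl "$@"
  else
    echo "skipping: systemctl $*"
  fi
}
```

In the script above, the three systemctl lines would become maybe_systemctl daemon-reload, maybe_systemctl enable tika and maybe_systemctl restart tika. The service then has to be started some other way, for example by the container's entrypoint.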

mrtnzagustin avatar Apr 23 '21 22:04 mrtnzagustin

Yes, the install went fine for me. Try using a fresh Ubuntu 20.04 or Debian buster in the cloud. I didn't have any problems with Tika as far as I remember. What version are you installing on?

jazzdup avatar Apr 24 '21 01:04 jazzdup

Installing in a Docker container is different from installing on a real virtualized Debian. For example, a Docker container doesn't have access to init, so it can't execute any systemctl instruction. That's the problem, I think.

mrtnzagustin avatar Apr 24 '21 01:04 mrtnzagustin

I tried open-semantic-search_21.01.03 from the web download and open-semantic-search_21.04.23 from a fresh build of the latest version in the git repository.

mrtnzagustin avatar Apr 24 '21 01:04 mrtnzagustin

It's one of these: clone the repo and sub-modules, then run docker-compose up; OR get the Debian installer and run it to install all the sub-components directly in your OS; OR use VirtualBox and run the VM; OR use the search appliance, but I haven't tried that.

jazzdup avatar Apr 24 '21 02:04 jazzdup

tbh this project is a great idea, but I found the code impossible to rewrite or extend, so I've started writing my own simpler version (also currently waiting for neuralcoref to be upgraded to work with spaCy v3, which should happen soon).

jazzdup avatar Apr 24 '21 03:04 jazzdup

@jazzdup Good to hear. I am using the project to discover tools like Apache Tika, spaCy, etc. I am working on a few examples with Python, python-tika, Tika server and NER with spaCy. After that I want to add Solr, faceted search, etc., all as independent microservices. First steps here. Tell me when you finish coding!!

I'm still waiting for someone who can install the whole build package in a clean Docker Ubuntu or Debian.

mrtnzagustin avatar Apr 26 '21 01:04 mrtnzagustin

I built the entire system by hand, all components, about a year ago, as part of my learning curve. There were interlocking dependencies like this at the time. The only solution was to build the piece you wanted, after modifying its dependencies.

Based on my experience with OSS, I do not think Markus would have the time to untangle this. It's been a while since I touched this, but I recall keeping a directory of all the debs, which could be successfully installed as a group.

I wouldn't normally use an open issue for such a specific request, but I don't see a way to DM people in Github. Anyway, I use Ubuntu. I can install the desktop release of this, index and search. But, I have not been able to get the server release to completely install. I've downloaded the code, built it, and that approach doesn't work either. I tried the Docker approach, too. Each hangs up in a different place during the installation (Solr, Tika) and container deploy (Tika).

I'd really like to be able to install this by hand, installing the servers first and then deploying code in each server, rather than run a container or install Debian just for this. So, I want to ask you, NetwarSystem, how to go about doing what you did.

lsitongia avatar Feb 20 '22 19:02 lsitongia

@mrtnzagustin - only just noticed your comment, sorry. The repo had to go private unfortunately for legal reasons then the project's funding was stopped as it unfortunately started meandering in the wrong direction... away from the simple open source search service I'd wanted to build. Then on the NLP side it turned out impossible to get coreference resolution working with Spacy 3 so they decided to build their own coref solution sometime in the future but there's been little news on it lately.

jazzdup avatar Feb 21 '22 14:02 jazzdup

@lsitongia I'm not sure if it'll help, but I did set it up OK on Ubuntu 20.04 LTS in the cloud. I think it was on Azure. Here are my personal notes in case you're curious to try that path. I'm sure it'll be totally different on whatever version of Ubuntu you're using, so good luck with that!

ossuser/ssh... oss_key chmod 400
apt-get update
dpkg --install open-semantic-search_21.01.03.deb
apt-get -f install
apt-get -f install
had to run it twice 'cos needs jvm installed first
annoyingly downloads a spacy model for every language!? and flower not avail.

Setting up default-jre-headless (2:1.11-71) ...
Setting up open-semantic-search (20.01.17) ...
/opt/postinst.solr.deb: 5: /opt/postinst.solr.deb: java: not found
No Java installed yet. Pleasy try again after Java has been installed (automatically because of dependencies)
dpkg: error processing package open-semantic-search (--configure):
 installed open-semantic-search package post-installation script subprocess returned error exit status 1
Processing triggers for libc-bin (2.28-10) ...
Processing triggers for systemd (241-7~deb10u5) ...
Processing triggers for man-db (2.8.5-2) ...
Processing triggers for ca-certificates (20200601~deb10u1) ...
Updating certificates in /etc/ssl/certs...
0 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...

done.
done.
Setting up openjdk-11-jre-headless:amd64 (11.0.9.1+1-1~deb10u2) ...
update-alternatives: using /usr/lib/jvm/java-11-openjdk-amd64/bin/rmid to provide /usr/bin/rmid (rmid) in auto mode
update-alternatives: using /usr/lib/jvm/java-11-openjdk-amd64/bin/java to provide /usr/bin/java (java) in auto mode
update-alternatives: using /usr/lib/jvm/java-11-openjdk-amd64/bin/keytool to provide /usr/bin/keytool (keytool) in auto mode
update-alternatives: using /usr/lib/jvm/java-11-openjdk-amd64/bin/jjs to provide /usr/bin/jjs (jjs) in auto mode
update-alternatives: using /usr/lib/jvm/java-11-openjdk-amd64/bin/pack200 to provide /usr/bin/pack200 (pack200) in auto mode
update-alternatives: using /usr/lib/jvm/java-11-openjdk-amd64/bin/unpack200 to provide /usr/bin/unpack200 (unpack200) in auto mode
update-alternatives: using /usr/lib/jvm/java-11-openjdk-amd64/bin/jfr to provide /usr/bin/jfr (jfr) in auto mode
update-alternatives: using /usr/lib/jvm/java-11-openjdk-amd64/bin/rmiregistry to provide /usr/bin/rmiregistry (rmiregistry) in auto mode
update-alternatives: using /usr/lib/jvm/java-11-openjdk-amd64/lib/jexec to provide /usr/bin/jexec (jexec) in auto mode
Processing triggers for dictionaries-common (1.28.1) ...
Errors were encountered while processing:
 open-semantic-search
E: Sub-process /usr/bin/dpkg returned an error code (1)

Jan 21 17:46:21 oss su[30765]: pam_unix(su-l:session): session opened for user solr by (uid=0)
Jan 21 17:46:21 oss solr[30763]: *** [WARN] *** Your open file limit is currently 1024.
Jan 21 17:46:21 oss solr[30763]: It should be set to 65000 to avoid operational disruption.
Jan 21 17:46:21 oss solr[30763]: If you no longer wish to see this warning, set SOLR_ULIMIT_C…r.in.sh
Jan 21 17:46:21 oss solr[30763]: *** [WARN] *** Your Max Processes Limit is currently 31798.
Jan 21 17:46:21 oss solr[30763]: It should be set to 65000 to avoid operational disruption.
Jan 21 17:46:21 oss solr[30763]: If you no longer wish to see this warning, set SOLR_ULIMIT_C…r.in.sh
Jan 21 17:46:21 oss solr[30763]: NOTE: Please install lsof as this script needs it to determin…t 8983.
Jan 21 17:46:31 oss solr[30763]: Started Solr server on port 8983 (pid=30829). Happy searching!
Jan 21 17:46:31 oss systemd[1]: Started LSB: Controls Apache Solr as a Service.

#########################################################
issues:
neo4j running but can't connect from remote - FIXED
missing celery/flower, are older pip libs

todo:
look at logs of various services as per url above - DONE
try tunnelling first, then block ports - works for port 80 as root, but not 5555, 8983, 7474

  • DONE

apache config blocking? seems not - DONE
http://httpd.apache.org/docs/2.4/howto/access.html
root@oss:/etc/apache2# grep -Rn Require .

django user/pass?
check all files loaded, check how many/size + time
why slow? - keeps restarting
increase logging, look at more logs
review datastax notes and track memory usage
change langs to en only... might be causing crashes?

how to connect/ssh tunnel: https://www.ssh.com/ssh/tunneling/example
e.g. sudo ssh -i oss_key.pem -L80:localhost:80 [email protected]
use tunnel.sh [ip] 80, 7474, 7687

#########################################################
tips:
sudo -s
/var/solr/logs/solr.log

/var/log/syslog

  • spacy NER keeps dying, OOM,

pip list | grep celery

  • 4.2.1 vs 4.4.7 on vm

/etc/solr-php-ui

  • for gui admin/apache.conf

/etc/opensemanticsearch/etl

  • connector debug on
  • configure all etl plugins for ocr, spacy, langs, etc.

/etc/opensemanticsearch/enhancer-rdf

  • configure external wikidata/schema.org url's to map rdf props to our facets

/etc/apache2/apache2.conf

  • access control for solr ui is in /opt/solr/server/etc/jetty-http.xml
  • delete the default="127.0.0.1" from the config option "host", then: service solr restart

/var/solr/logs/solr.log

config:

/etc/opensemanticsearch/etl

  • set en only and verbose

/etc/opensemanticsearch/connector-files

  • set config mappings
    e.g. for docker: config['mappings'] = { "/var/opensemanticsearch/": "http://localhost:8080/" }
    e.g. for debian: config['mappings'] = { "/var/www/documents/": "http://localhost:8080/" }

  • set verbose

deleting all files does NOT delete neo4j
also: "Starting Exporter: Solr Not exported to Solr because no data or yet exported in this ETL run, because exporter was runned as plugin."

try again - check solr status first

on debian version, reference files as /var/www/documents and put docs in there.
create that directory and change /etc/apache2/sites-enabled/000-default.conf to /var/www/ (strip off html)
sudo service apache2 restart
ensure /etc/opensemanticsearch/connector-files config['mappings'] = { "/var/www/": "http://localhost/" }
sudo service opensemanticetl restart

#########################################################
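
The tunnel.sh mentioned in the notes above is not shown. As a guess at what such a wrapper might look like, here is a minimal sketch that builds one -L forward per port for the ssh command quoted in the notes. tunnel_args is a hypothetical helper name, and the key file, user and host in the usage note are placeholders:

```shell
#!/bin/sh
# Hypothetical helper: print one "-L port:localhost:port" forwarding option
# for every port given as an argument.
tunnel_args() {
  args=""
  for port in "$@"; do
    args="$args -L $port:localhost:$port"
  done
  # Strip the leading space before printing.
  echo "${args# }"
}
```

Usage, with placeholder user and host: ssh -i oss_key.pem $(tunnel_args 80 7474 7687) user@host. Binding local port 80 normally needs root on the client side, which would explain the sudo in the notes; mapping to a higher local port is an alternative.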

jazzdup avatar Feb 21 '22 14:02 jazzdup