docker-images
Oracle Database 18 XE container fails to start on CentOS
Hello,
First of all, unfortunately I am unable to post full logs because the host is in a closed network and because of security restrictions, but I will try my best to provide all the necessary info.
Host: fresh CentOS 7 install Image: Oracle database 18.4.0 XE Docker version: 19.03.7 build 7141c199a2
The image was originally built on a Windows machine and transferred to a Windows 10 host in the closed network, where it ran without problems. Transferring it to the Linux host is where I ran into problems (I should be able to run the Linux image on both, I assume).
$docker run --name xx -d -p 1521:1521 -p 5500:5500 qwerty:asdfg
-- Container starts up
Inside container though:
$lsnrctl status
TNS-12541 TNS:no listener
TNS: protocol adapter error
No listener
Linux Error: 111: Connection Refused.
$docker logs
Oracle base remains unchanged with value /opt/oracle
su: cannot open session: permission denied
#################################
DATABASE SETUP WAS NOT SUCCESSFUL!
From what I could find by searching, SELinux might be the problem, but after disabling it there is no change.
$sestatus
SELinux status: Disabled
I also tried adding the tmpfs line to /etc/fstab on the CentOS machine:
$vi /etc/fstab
UUID=ce2fce41-asdf2-...- /boot
/dev/mapper/centos-swap swap swap defaults 0 0
tmpfs /dev/shm tmpfs defaults,size=1024m 0 0
(Edited for formatting) Any help would be much appreciated, thanks!
That last line in your /etc/fstab is wrong (there shouldn't be a $ at the beginning). Can you paste the actual content of the /etc/fstab from your CentOS machine?
Also, I would strongly encourage you to install Oracle Linux 7 and use Oracle Container Runtime for Docker to both build and run the image, just so we can eliminate any weirdness that may have come from building the image on Windows. I don't think that's the problem, but I can't rule it out either.
We also need the output from docker info and docker logs (sanitised is fine) from your CentOS box because the su error is a known issue with XE and there is probably/hopefully something else in the logs that indicates the actual issue, like a filesystem issue somewhere.
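Since the logs may be posted sanitised, here is a minimal sketch of one way to mask IPv4 addresses before pasting. The helper name sanitise_log is hypothetical, and the pattern only covers dotted-quad addresses; hostnames, user names and paths would still need manual review.

```shell
#!/bin/sh
# Hypothetical helper: mask IPv4 addresses in text read from stdin or from files.
# Assumption: masking dotted-quad addresses is enough for this host; it is NOT a
# complete sanitiser (hostnames, SIDs and paths are left untouched).
sanitise_log() {
    sed -E 's/([0-9]{1,3}\.){3}[0-9]{1,3}/x.x.x.x/g' "$@"
}

# Demo on a sample line; no Docker daemon is needed for this part.
echo 'listener bound to 192.168.10.5:1521' | sanitise_log
```

On the CentOS box this could be used as `docker logs xx 2>&1 | sanitise_log` or `docker info | sanitise_log`, where xx is the container name from the run command above. Three-part version strings such as 19.03.7 are not touched by the pattern.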
Thank you very much Djelibeybi for a quick reply!
Ah yes, the $ in fstab is a typo I made when typing it here.
$docker info
Containers: 1
Running: 1
Paused: 0
Stopped: 0
Images: 1
Server v. 19.03.7
Storage Driver: overlay2
Backing filesys: unknown
Supports d_type: true
Native overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default runtime: runc
Init Binary: docker-init
containerd version: hash
runc version: hash
init version: fec3683
security options:
seccomp
Profile default
Kernel version: 3.10.0-1062.el7.x86_64
OS: Centos linux 7 core
ostype: linux
arch: x86_64
cpus: 2
Total memory: 7,796GiB
name: xx
id: u74R:GYF2:REI3:GPOFOIQN:MM5NFM6L
docker root dir: /var/lib/docker
debug mode: false
registry: https://index.docker.io/v1/
labels:
experimental false
insecure registries:
127.0.0.0/8
live restore enabled: false
Product license: community
$docker logs <container>
The oracle base remains unchanged with the value /opt/oracle
su: cannot open session: permission denied
######
#error#
Database setup was not successful
The following output is now tail of the alert log:
Killed oracle process oracle@asd (Q003) iwt pid is 11, OS pid 122
Stopping background process MMNL
timestamp
Process termination requested for pid 112 (source = rdbms), (info = 2) (request issued by pid 2211, uid: 42145)
Process termination requested for pid 112 (source = rdbms), (info = 2) (request issued by pid 123, uid: 42145)
Stopping background process MMON
OS process OFSD idle for 30 seconds, exiting.
The CentOS host itself is basically a fresh install with most of the usual tools. In addition to that, I downloaded docker.tar from the website:
- unzipped
- set usermod -aG
- $su dockerd &
- saved the image on the previous host machine and loaded it here (could it have something to do with using load instead of import?)
I also tried $docker run --name xx --shm-size=1g -p 1521:1521 -p 5500:5500 container-name
I updated /etc/fstab and removed the typo. Import or load, it didn't make a difference.
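On the load-versus-import question: docker save / docker load move an image with its layers and metadata intact, while docker export / docker import flatten a container filesystem and drop image metadata such as CMD and ENV. A minimal sketch of the save/load path follows; the function names and the tar path are hypothetical, and with DRY_RUN=1 the commands are only printed.

```shell
#!/bin/sh
# Sketch only: run, save_image, load_image and /tmp/test-db.tar are illustrative names.
# With DRY_RUN=1 each command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# On the build host: write the image (layers plus metadata) to a tar archive.
save_image() {   # usage: save_image IMAGE TARFILE
    run docker save -o "$2" "$1"
}

# On the target host: restore the image from that archive.
load_image() {   # usage: load_image TARFILE
    run docker load -i "$1"
}

# Preview both commands without needing a Docker daemon.
DRY_RUN=1
save_image test-db:latest /tmp/test-db.tar
load_image /tmp/test-db.tar
```

Setting DRY_RUN=0 (or unsetting it) makes the same functions run the real commands. Since the image started fine on the Windows host after the same kind of transfer, the transfer method is probably not the cause here, but this rules it out cheaply.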
I found this issue, which has similar symptoms, but correct me if I'm wrong: my issue should not be the same, since I have commented out VOLUME in the Dockerfile (I drop a database dump inside the container and commit it, so that I always have a test database with the same data).
https://github.com/oracle/docker-images/issues/783
Second similar case with listener connection refused:
https://stackoverflow.com/questions/34484848/unable-to-start-oracle-listener-in-docker-oracle-on-red-hat
$stat / shows access 0755/drwxr-xr-x on both containers, the one working on the Windows host and the one not working on the CentOS host.
Probably a step closer: I built the original Oracle 18 XE image with only one change, commenting out the VOLUME line in the Dockerfile (#VOLUME).
I saved the image to a tar file, moved it over to the CentOS machine and loaded it. Then I ran it:
$docker run --name testdb --shm-size=1g -p 1521:1521 -p 5500:5500 test-db:latest
ORACLE PASSWORD FOR SYS....
Specify a password....
Confirm the password:
Configuring Oracle Listener.
su: cannot open session: Permission Denied
su: cannot open session: Permission Denied
Listener configuration failed
su: cannot open session: Permission Denied
su: cannot open session: Permission Denied
su: cannot open session: Permission Denied
su: cannot open session: Permission Denied
su: cannot open session: Permission Denied
cp: cannot create regular file '...'
I believe this has to do with the default behavior of containers not allowing su inside. You can try to bypass that by commenting out the following line in /etc/security/limits.d/oracle-database-preinstall-18c.conf:
# oracle-database-preinstall-18c setting for memlock hard limit is maximum of 128GB on x86_64 or 3GB on x86 OR 90 % of RAM
#oracle hard memlock 134217728
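A minimal sketch of that edit, previewed on a scratch copy rather than on the real file. The helper name comment_memlock is hypothetical, and the sample content only approximates the layout of the preinstall limits file.

```shell
#!/bin/sh
# Hypothetical helper: print the input with the "oracle hard memlock" line commented out.
# It reads files given as arguments, or stdin, and never edits anything in place.
comment_memlock() {
    sed -E 's/^(oracle[[:space:]]+hard[[:space:]]+memlock[[:space:]])/#\1/' "$@"
}

# Demo on a scratch file; the real target would be
# /etc/security/limits.d/oracle-database-preinstall-18c.conf inside the image.
scratch=$(mktemp)
printf '%s\n' \
    'oracle   soft   memlock    134217728' \
    'oracle   hard   memlock    134217728' > "$scratch"
comment_memlock "$scratch"
rm -f "$scratch"
```

Only the hard limit line is changed; the soft limit line passes through untouched. To apply it for real, the output would have to be written back to the file during the image build, which this sketch deliberately does not do.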
I think it is related to pam.d
Found a forked repo issue:
https://github.com/fuzziebrain/docker-oracle-xe/issues/15
sed -i -r 's/^(session\s+required\s+pam_limits.so)/#\1/' /etc/pam.d/* && \
This is the commit, works for me also:
https://github.com/dejvid-smth/docker-images/commit/c306376faa0c04832c212cf32ef17ee610368a80
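To see what the pam_limits workaround changes before baking it into an image, the substitution can be previewed on a scratch file. This is a sketch: comment_pam_limits is a hypothetical helper name, the sample PAM lines are illustrative, and the regex is the sed from the linked issue rewritten with POSIX character classes so it prints to stdout instead of editing /etc/pam.d/* in place.

```shell
#!/bin/sh
# Hypothetical helper: print the input with "session required pam_limits.so"
# lines commented out. Reads files given as arguments, or stdin.
comment_pam_limits() {
    sed -E 's/^(session[[:space:]]+required[[:space:]]+pam_limits\.so)/#\1/' "$@"
}

# Demo on a scratch file that mimics a PAM service file (sample content only).
scratch=$(mktemp)
printf '%s\n' \
    'auth       sufficient   pam_rootok.so' \
    'session    required     pam_limits.so' \
    'session    optional     pam_keyinit.so' > "$scratch"
comment_pam_limits "$scratch"
rm -f "$scratch"
```

Only the line that loads pam_limits.so for sessions is commented out, which is the module that applies the limits.d settings when su opens a session. Disabling it also disables those limits for every PAM service in the container, so it is a workaround rather than a fix.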
This issue should have been fixed by now.
There have been changes to the pre-install package, and #1826 / #1829 improve podman support...
To add some additional information (not entirely sure how useful it is), we were running this image (built 30th June 2020) on EKS with Kubernetes 1.17 without any issues. We updated our cluster to 1.18 yesterday and now we have the problems described above.
The fix provided by @AmedeeBulle in #1826 doesn't seem to fix our problem, whereas adding the fuzziebrain/docker-oracle-xe#15 fix does...
I'm not particularly involved in managing the cluster, and in truth have a fairly limited experience of Docker and Kubernetes, but I'm baffled as to why this became a problem after upgrading our cluster...
Use the latest XE, i.e. 21.3.