oracledb exporter service keeps crashing

Open micksey9 opened this issue 3 years ago • 11 comments

The service runs and is able to scrape metrics for about 30 minutes and then it crashes. Don't see much info in journalctl logs even though I do have the log level set to debug. The DB is Oracle 19.8; the exporter release I'm using is 0.2.9; I've also tried the latest pre-release and I see the same issue with that. Following is my systemd-service file config:

[Unit]
Description=Service for oracle telemetry client
After=network.target

[Service]
Type=simple
User=oracledb_exporter
Group=oracledb_exporter
Environment="DATA_SOURCE_NAME=user/password@//localhost:1521/nms"
Environment="LD_LIBRARY_PATH=/opt/oracledb_exporter/instantclient_19_10"
Environment="ORACLE_HOME=/opt/oracledb_exporter/instantclient_19_10"
#Environment="CUSTOM_METRICS=/etc/oracledb_exporter/custom-metrics.toml"
ExecStart=/usr/local/bin/oracledb_exporter --default.metrics "/etc/oracledb_exporter/default-metrics.toml" --log.level debug --web.listen-address 0.0.0.0:9161

[Install]
WantedBy=multi-user.target

The oracledb_exporter binary is owned by oracledb_exporter user; same for /etc/oracledb_exporter as well as the metrics file. Following is what I see in journalctl logs:

Apr 21 22:51:50 oracledb_exporter[1993306]: time="2021-04-21T22:51:50Z" level=debug msg="Successfully scrapped metric: tablespace" source="main.go:238"
Apr 21 22:52:11 oracledb_exporter[1993306]: time="2021-04-21T22:52:11Z" level=debug msg="Successfully pinged Oracle database: " source="main.go:210"
Apr 21 22:52:11 oracledb_exporter[1993306]: time="2021-04-21T22:52:11Z" level=debug msg="About to scrape metric: " source="main.go:215"
Apr 21 22:52:11 oracledb_exporter[1993306]: time="2021-04-21T22:52:11Z" level=debug msg="- Metric MetricsDesc: map[value:Gauge metric with count of sessions by status and type.]" source="main.go:216"
Apr 21 22:52:11 oracledb_exporter[1993306]: time="2021-04-21T22:52:11Z" level=debug msg="- Metric Context: sessions" source="main.go:217"
Apr 21 22:52:11 oracledb_exporter[1993306]: time="2021-04-21T22:52:11Z" level=debug msg="- Metric MetricsType: map[]" source="main.go:218"
Apr 21 22:52:11 oracledb_exporter[1993306]: time="2021-04-21T22:52:11Z" level=debug msg="- Metric Labels: [status type]" source="main.go:219"
Apr 21 22:52:11 oracledb_exporter[1993306]: time="2021-04-21T22:52:11Z" level=debug msg="- Metric FieldToAppend: " source="main.go:220"
Apr 21 22:52:11 oracledb_exporter[1993306]: time="2021-04-21T22:52:11Z" level=debug msg="- Metric IgnoreZeroResult: false" source="main.go:221"
Apr 21 22:52:11 oracledb_exporter[1993306]: time="2021-04-21T22:52:11Z" level=debug msg="- Metric Request: SELECT status, type, COUNT(*) as value FROM v$session GROUP BY status, type" source="main.go:222"
Apr 21 22:52:11 oracledb_exporter[1993306]: time="2021-04-21T22:52:11Z" level=debug msg="Calling function ScrapeGenericValues()" source="main.go:263"
Apr 21 22:52:11 systemd[1]: oracledb_exporter.service: Main process exited, code=killed, status=11/SEGV
Apr 21 22:52:11 systemd[1]: oracledb_exporter.service: Failed with result 'signal'.
Apr 21 22:52:12 systemd-coredump[1994826]: Process 1993306 (oracledb_export) of user 2003 dumped core.
    #0  0x0000000000466de1 n/a (oracledb_exporter)
    #1  0x000000000044bb9a n/a (oracledb_exporter)
    #2  0x000000000044a776 n/a (oracledb_exporter)
    #3  0x0000000000467143 n/a (oracledb_exporter)
-- Subject: Process 1993306 (oracledb_export) dumped core
-- Process 1993306 (oracledb_export) crashed and dumped core.
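For anyone comparing crashes across hosts, a small script can pull out which metric the exporter was scraping when systemd reported the kill. This is a minimal sketch, assuming the debug log format shown above; `last_context_before_crash` is a hypothetical helper name, not part of the exporter. Note that the last logged context is only where the exporter happened to be, so it may be coincidental rather than the cause.

```python
import re

# Patterns based on the journalctl excerpt above (assumed stable across versions).
CONTEXT_RE = re.compile(r'msg="- Metric Context: (?P<context>[^"]*)"')
CRASH_RE = re.compile(r"status=(?P<status>\d+)/(?P<signal>[A-Z]+)")


def last_context_before_crash(journal_lines):
    """Return (metric_context, signal_name) for the first crash found, else None."""
    last_context = None
    for line in journal_lines:
        context = CONTEXT_RE.search(line)
        if context:
            # Remember the most recent metric the exporter announced.
            last_context = context.group("context")
            continue
        crash = CRASH_RE.search(line)
        if crash:
            return last_context, crash.group("signal")
    return None


if __name__ == "__main__":
    sample = [
        'Apr 21 22:52:11 oracledb_exporter[1993306]: time="2021-04-21T22:52:11Z" '
        'level=debug msg="- Metric Context: sessions" source="main.go:217"',
        "Apr 21 22:52:11 systemd[1]: oracledb_exporter.service: "
        "Main process exited, code=killed, status=11/SEGV",
    ]
    print(last_context_before_crash(sample))
```

Feeding it the output of `journalctl -u oracledb_exporter` line by line would show whether the crash always follows the same metric or lands on different ones.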

A core file is generated for every crash; this is the content of the latest one:

   Storage: /var/lib/systemd/coredump/core.oracledb_export.2003.6f0a6b7367b440ac8146646086e10fb0.1993306.1619045531000000.lz4
   Message: Process 1993306 (oracledb_export) of user 2003 dumped core.

            Stack trace of thread 1994824:
            #0  0x0000000000466de1 n/a (oracledb_exporter)
            #1  0x000000000044bb9a n/a (oracledb_exporter)
            #2  0x000000000044a776 n/a (oracledb_exporter)
            #3  0x0000000000467143 n/a (oracledb_exporter)
            #4  0x00007f6ae8bc8dd0 __restore_rt (libpthread.so.0)
            #5  0x00007f6ae8bc7615 unwind_stop (libpthread.so.0)
            #6  0x00007f6abc41cb9e _Unwind_ForcedUnwind_Phase2 (libgcc_s.so.1)
            #7  0x00007f6abc41d210 _Unwind_ForcedUnwind (libgcc_s.so.1)
            #8  0x00007f6ae8bc76e6 __pthread_unwind (libpthread.so.0)
            #9  0x00007f6ae8bbf56b pthread_exit (libpthread.so.0)
            #10 0x00007f6ae6d8b20a SltsqSigFunc (libclntshcore.so.19.1)
            #11 0x00007f6ae6d8de5b sslssAsynchHdlr (libclntshcore.so.19.1)
            #12 0x00007f6ae6d8d881 sslsshandler (libclntshcore.so.19.1)
            #13 0x00007f6ae8bc8dd0 __restore_rt (libpthread.so.0)
            #14 0x00007f6ae88efe75 __clone (libc.so.6)
            #15 0x00000000000007d3 n/a (n/a)
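The trace is easier to compare between reports once it is split into frames. The sketch below is an illustration only, assuming the `#N 0xADDR symbol (module)` layout printed above; `parse_frames` and `exit_called_from` are hypothetical helper names. It reports which frame called `pthread_exit`, which in this trace is `SltsqSigFunc` in `libclntshcore.so.19.1`, reached through the library's signal handler frames.

```python
import re

# One frame per match: "#10 0x00007f6ae6d8b20a SltsqSigFunc (libclntshcore.so.19.1)"
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+0x(?P<addr>[0-9a-f]+)\s+(?P<symbol>\S+)\s+\((?P<module>[^)]+)\)"
)


def parse_frames(trace_text):
    """Return a list of (index, symbol, module) tuples in the order printed."""
    return [
        (int(m.group("index")), m.group("symbol"), m.group("module"))
        for m in FRAME_RE.finditer(trace_text)
    ]


def exit_called_from(frames, exit_symbol="pthread_exit"):
    """Return (symbol, module) of the caller of exit_symbol, or None if absent.

    In a backtrace the caller is the frame printed directly after the callee.
    """
    for position, (_, symbol, _) in enumerate(frames):
        if symbol == exit_symbol and position + 1 < len(frames):
            _, caller_symbol, caller_module = frames[position + 1]
            return caller_symbol, caller_module
    return None


if __name__ == "__main__":
    trace = """
    #8  0x00007f6ae8bc76e6 __pthread_unwind (libpthread.so.0)
    #9  0x00007f6ae8bbf56b pthread_exit (libpthread.so.0)
    #10 0x00007f6ae6d8b20a SltsqSigFunc (libclntshcore.so.19.1)
    #11 0x00007f6ae6d8de5b sslssAsynchHdlr (libclntshcore.so.19.1)
    """
    print(exit_called_from(parse_frames(trace)))
```

The input can be the `Stack trace of thread` section printed by `coredumpctl info`, pasted in as text.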

micksey9 avatar Apr 21 '21 23:04 micksey9

Can you try updating the version of the binaries and building the exporter yourself?

Yannig avatar Apr 27 '21 07:04 Yannig

Do you have any documentation or online resource you can point me to that covers the steps to build the exporter myself?

micksey9 avatar May 01 '21 17:05 micksey9

@Yannig I built the exporter with oracle-instantclient19.11. The database is Oracle Database 19c Standard Edition 2 Release 19.0.0.0.0 - Production, version 19.11.0.0.0. It still dumps core:

       Message: Process 2527312 (oracledb_export) of user 0 dumped core.

                Stack trace of thread 2527322:
                #0  0x0000000000467741 n/a (oracledb_exporter)
                #1  0x000000000044c63a _start (oracledb_exporter)
                #2  0x000000000044b216 _start (oracledb_exporter)
                #3  0x0000000000467aa3 n/a (oracledb_exporter)
                #4  0x00007f706e466dd0 __restore_rt (libpthread.so.0)
                #5  0x00007f706e465615 unwind_stop (libpthread.so.0)
                #6  0x00007f7035f0eb9e _Unwind_ForcedUnwind_Phase2 (libgcc_s.so.1)
                #7  0x00007f7035f0f210 _Unwind_ForcedUnwind (libgcc_s.so.1)
                #8  0x00007f706e4656e6 __pthread_unwind (libpthread.so.0)
                #9  0x00007f706e45d56b pthread_exit (libpthread.so.0)
                #10 0x00007f706c77ad0a SltsqSigFunc (libclntshcore.so.19.1)
                #11 0x00007f706c62ebbb sslssAsynchHdlr (libclntshcore.so.19.1)
                #12 0x00007f706c62e5e1 sslsshandler (libclntshcore.so.19.1)
                #13 0x00007f706e466dd0 __restore_rt (libpthread.so.0)
                #14 0x00007f706e18de85 __clone (libc.so.6)
                #15 0x0000000000000000 n/a (n/a)

lexx-bright avatar May 13 '21 11:05 lexx-bright

We have the same issue.

Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production, version 19.11.0.0.0
OS: RHEL 7

zhzhff avatar Jul 23 '21 06:07 zhzhff

Is there a resolution for this issue? I'm facing the same problem with the exporter against a 19c database. It crashes after running for quite some time. Please assist.

manjukashyap avatar Nov 09 '21 04:11 manjukashyap

I am also facing the same issue, PLEASE ADVISE

STATUS LOG

[root@ip-17X-XX-11-198 prometheus]# systemctl status oracledb_exporter -l
● oracledb_exporter.service - oracle_database_exporter
   Loaded: loaded (/etc/systemd/system/oracledb_exporter.service; disabled; vendor preset: disabled)
   Active: failed (Result: signal) since Tue 2021-11-16 06:21:21 UTC; 38min ago
  Process: 6828 ExecStart=/usr/local/bin/oracledb_exporter --default.metrics /etc/oracledb_exporter/default-metrics.toml --log.level error --web.listen-address 0.0.0.0:9161 (code=killed, signal=SEGV)
 Main PID: 6828 (code=killed, signal=SEGV)

Nov 16 06:16:17 ip-17X-XX-11-198.ap-south-1.compute.internal systemd[1]: Started oracle_database_exporter.
Nov 16 06:21:21 ip-17D-SS-11-198.ap-south-1.compute.internal systemd[1]: oracledb_exporter.service: main process exited, code=killed, status=11/SEGV
Nov 16 06:21:21 ip-17D-CX-11-198.ap-south-1.compute.internal systemd[1]: Unit oracledb_exporter.service entered failed state.
Nov 16 06:21:21 ip-17X-DF-11-198.ap-south-1.compute.internal systemd[1]: oracledb_exporter.service failed.
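Since reports in this thread differ on how long the exporter survives, it may help to compute the uptime from the systemd journal lines directly. A rough sketch, assuming the classic syslog timestamp prefix (`Nov 16 06:16:17`) and an English locale for month names; `seconds_until_crash` is a hypothetical helper, and the year is passed in because the journal prefix omits it.

```python
from datetime import datetime


def seconds_until_crash(started_line, crashed_line, year=2021):
    """Return seconds between two journal lines with 'Mon DD HH:MM:SS' prefixes.

    The year argument is an assumption supplied by the caller; both lines
    are taken to fall in that same year.
    """

    def stamp(line):
        # The first three whitespace-separated fields form the timestamp.
        month, day, clock = line.split()[:3]
        return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")

    return (stamp(crashed_line) - stamp(started_line)).total_seconds()


if __name__ == "__main__":
    started = (
        "Nov 16 06:16:17 ip-17X-XX-11-198.ap-south-1.compute.internal "
        "systemd[1]: Started oracle_database_exporter."
    )
    crashed = (
        "Nov 16 06:21:21 ip-17D-SS-11-198.ap-south-1.compute.internal systemd[1]: "
        "oracledb_exporter.service: main process exited, code=killed, status=11/SEGV"
    )
    print(seconds_until_crash(started, crashed))
```

For the status output above this gives a little over five minutes between start and crash, compared with the roughly 30 minutes in the original report.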

OracleMonitoring avatar Nov 16 '21 07:11 OracleMonitoring

Same here. Running the latest binary (version 0.3.2) on CentOS 7, monitoring a 19c DB. The exporter keeps crashing with status=11/SEGV.

The same goes for exporter versions 0.2.9 and 0.3.0.

agmimidi avatar Nov 30 '21 12:11 agmimidi

Any update on this, please?

manjukashyap avatar Jan 06 '22 15:01 manjukashyap

Same. Running the latest binary on CentOS 7, monitoring a 19c DB. The exporter keeps crashing with status=11/SEGV.

azdfzshffg avatar Jan 11 '22 14:01 azdfzshffg

Any update?

azdfzshffg avatar Feb 15 '22 08:02 azdfzshffg

Experiencing the same issue. Running on Amazon Linux 2, monitoring an RDS 19c DB with the 19.10 client. Getting status=11/SEGV after a period of time.

This seems to be the same issue as reported in #167.

Dugera27 avatar Mar 10 '22 18:03 Dugera27