uyuni icon indicating copy to clipboard operation
uyuni copied to clipboard

TaskomaticDaemon - Failure occured during job recovery

Open sysy69 opened this issue 3 years ago • 16 comments
trafficstars

Problem description

Taskomatic is crashing all 10Sec.

Version of Uyuni Server and Proxy (if used)

Information for package Uyuni-Server-release:

Repository : uyuni-server-stable Name : Uyuni-Server-release Version : 2022.05-180.1.uyuni1 Arch : x86_64 Vendor : obs://build.opensuse.org/systemsmanagement:Uyuni Support Level : Level 3 Installed Size : 1.4 KiB Installed : Yes Status : up-to-date Source package : Uyuni-Server-release-2022.05-180.1.uyuni1.src Summary : Uyuni Server Description : Uyuni lets you efficiently manage physical, virtual, and cloud-based Linux systems. It provides automated and cost-effective configuration and software management, asset management, and system provisioning.

Details about the issue

Taskomatic stopt working, no updates or anything else was made.

error: [Thread-43] FATAL com.redhat.rhn.taskomatic.core.TaskomaticDaemon - Failure occured during job recovery.

/var/log/rhn/rhn_taskomatic_daemon.log

2022-06-09 12:53:35,015 [Thread-43] WARN  org.hibernate.orm.deprecation - HHH90000014: Found use of deprecated [org.hibernate.id.SequenceGenerator] sequence-based id generator; use org.hibernate.id.enhanced.SequenceStyleGenerator instead.  See Hibernate Domain Model Mapping Guide for details.
2022-06-09 12:53:35,015 [Thread-43] WARN  org.hibernate.orm.deprecation - HHH90000014: Found use of deprecated [org.hibernate.id.SequenceGenerator] sequence-based id generator; use org.hibernate.id.enhanced.SequenceStyleGenerator instead.  See Hibernate Domain Model Mapping Guide for details.
2022-06-09 12:53:38,618 [Thread-43] WARN  org.hibernate.orm.deprecation - HHH90000017: Found use of deprecated entity-type selector syntax in HQL/JPQL query ['s.class']; use TYPE operator instead : type(s)
2022-06-09 12:53:38,742 [Thread-43] WARN  org.hibernate.orm.deprecation - HHH90000017: Found use of deprecated entity-type selector syntax in HQL/JPQL query ['s.class']; use TYPE operator instead : type(s)
2022-06-09 12:53:38,850 [Thread-43] WARN  org.hibernate.orm.deprecation - HHH90000017: Found use of deprecated entity-type selector syntax in HQL/JPQL query ['sa.server.class']; use TYPE operator instead : type(sa.server)
2022-06-09 12:53:38,861 [Thread-43] WARN  org.hibernate.orm.deprecation - HHH90000017: Found use of deprecated entity-type selector syntax in HQL/JPQL query ['sa.server.class']; use TYPE operator instead : type(sa.server)
2022-06-09 12:53:38,870 [Thread-43] WARN  org.hibernate.orm.deprecation - HHH90000017: Found use of deprecated entity-type selector syntax in HQL/JPQL query ['s.class']; use TYPE operator instead : type(s)
2022-06-09 12:53:39,291 [Thread-43] FATAL com.redhat.rhn.taskomatic.core.TaskomaticDaemon - Failure occured during job recovery.
2022-06-09 12:53:42,284 [Thread-43] WARN  org.hibernate.orm.deprecation - HHH020100: The Ehcache second-level cache provider for Hibernate is deprecated.  See https://hibernate.atlassian.net/browse/HHH-12441 for details.
2022-06-09 12:53:43,027 [Thread-43] WARN  org.hibernate.orm.deprecation - HHH90000009: The outer-join attribute on <many-to-many> has been deprecated. Instead of outer-join="false", use lazy="extra" with <map>, <set>, <bag>, <idbag>, or <list>, which will only initialize entities (not as a proxy) as needed.
2022-06-09 12:53:43,039 [Thread-43] WARN  org.hibernate.orm.deprecation - HHH90000005: Defining an entity [com.redhat.rhn.domain.channel.PublicChannelFamily] with no physical id attribute is no longer supported; please map the identifier to a physical entity attribute
2022-06-09 12:53:43,045 [Thread-43] WARN  org.hibernate.orm.deprecation - HHH90000009: The outer-join attribute on <many-to-many> has been deprecated. Instead of outer-join="false", use lazy="extra" with <map>, <set>, <bag>, <idbag>, or <list>, which will only initialize entities (not as a proxy) as needed.

_

sysy69 avatar Jun 09 '22 10:06 sysy69

Can you please provide complete logs?

aaannz avatar Jun 21 '22 12:06 aaannz

@sysy69 can you please enable log level debug and provide the complete /var/log/rhn/rhn_taskomatic_daemon.log ? for enabling debug log uncomment line

log4j.logger.com.redhat.rhn=DEBUG

from /usr/share/rhn/classes/log4j.properties

mbussolotto avatar Jun 27 '22 08:06 mbussolotto

We're seeing the same issue on 2022.05. However, I don't see a log4j.properties in that directory.

Should this be the edit?

     <Logger name="com.redhat.rhn.taskomatic.SchedulerKernel" level="debug" />
        <Logger name="com.redhat.rhn.taskomatic.task" level="debug" />

from /usr/share/rhn/classes/log4j2.xml

hbrown-uiowa avatar Jul 27 '22 22:07 hbrown-uiowa

I see this leading up to the fatal entries:

2022-07-27 18:12:21,817 [Thread-41] DEBUG com.redhat.rhn.common.localization.XmlMessages - Reloading BUNDLE : com.redhat.rhn.branding.strings.StringResource.en_US
2022-07-27 18:12:21,818 [Thread-41] DEBUG com.redhat.rhn.common.conf.Config - getString() -     getString() called with: java.supported_locales
2022-07-27 18:12:21,818 [Thread-41] DEBUG com.redhat.rhn.common.conf.Config - getString() -     getString() -> Getting property: supported_locales
2022-07-27 18:12:21,818 [Thread-41] DEBUG com.redhat.rhn.common.conf.Config - getString() -     getString() -> result: null
2022-07-27 18:12:21,818 [Thread-41] DEBUG com.redhat.rhn.common.conf.Config - getString() -     getString() -> returning: bn_IN,ca,de,en_US,es,fr,gu,hi,it,ja,ko,pa,pt,pt_BR,ru,ta,zh_CN,zh_TW
2022-07-27 18:12:23,901 [Thread-41] WARN  org.hibernate.orm.deprecation - HHH90000017: Found use of deprecated entity-type selector syntax in HQL/JPQL query ['s.class']; use TYPE operator instead : type(s)
2022-07-27 18:12:24,032 [Thread-41] WARN  org.hibernate.orm.deprecation - HHH90000017: Found use of deprecated entity-type selector syntax in HQL/JPQL query ['s.class']; use TYPE operator instead : type(s)
2022-07-27 18:12:24,147 [Thread-41] WARN  org.hibernate.orm.deprecation - HHH90000017: Found use of deprecated entity-type selector syntax in HQL/JPQL query ['sa.server.class']; use TYPE operator instead : type(sa.server)
2022-07-27 18:12:24,158 [Thread-41] WARN  org.hibernate.orm.deprecation - HHH90000017: Found use of deprecated entity-type selector syntax in HQL/JPQL query ['sa.server.class']; use TYPE operator instead : type(sa.server)
2022-07-27 18:12:24,166 [Thread-41] WARN  org.hibernate.orm.deprecation - HHH90000017: Found use of deprecated entity-type selector syntax in HQL/JPQL query ['s.class']; use TYPE operator instead : type(s)
2022-07-27 18:12:24,220 [Thread-41] DEBUG com.redhat.rhn.common.messaging.MessageQueue - startMessaging() - start
2022-07-27 18:12:24,221 [Thread-41] DEBUG com.redhat.rhn.common.conf.Config - getString() -     getString() called with: java.message_queue_thread_pool_size
2022-07-27 18:12:24,221 [Thread-41] DEBUG com.redhat.rhn.common.conf.Config - getString() -     getString() -> Getting property: message_queue_thread_pool_size
2022-07-27 18:12:24,221 [Thread-41] DEBUG com.redhat.rhn.common.conf.Config - getString() -     getString() -> result: null
2022-07-27 18:12:24,221 [Thread-41] DEBUG com.redhat.rhn.common.conf.Config - getString() -     getString() -> returning: 5
2022-07-27 18:12:24,222 [Thread-41] INFO  com.redhat.rhn.common.messaging.MessageQueueThreadPool - Started message queue thread pool (size: 5)
2022-07-27 18:12:24,223 [Thread-41] DEBUG com.redhat.rhn.common.messaging.MessageQueue - startMessaging() - end
2022-07-27 18:12:24,224 [Thread-41] DEBUG com.redhat.rhn.common.messaging.MessageQueue - registerAction(MessageAction, Class) - : com.redhat.rhn.frontend.events.TraceBackAction@17415034 class: com.redhat.rhn.frontend.events.TraceBackEvent
2022-07-27 18:12:24,225 [Thread-41] DEBUG com.redhat.rhn.common.messaging.MessageQueue - registerAction(MessageAction, Class) - : com.redhat.rhn.frontend.events.NewUserAction@49e84202 class: com.redhat.rhn.frontend.events.NewUserEvent
2022-07-27 18:12:24,226 [Thread-41] DEBUG com.redhat.rhn.common.messaging.MessageQueue - registerAction(MessageAction, Class) - : com.redhat.rhn.frontend.events.UpdateErrataCacheAction@158e3e42 class: com.redhat.rhn.frontend.events.UpdateErrataCacheEvent
2022-07-27 18:12:24,226 [Thread-41] DEBUG com.redhat.rhn.common.messaging.MessageQueue - registerAction(MessageAction, Class) - : com.redhat.rhn.frontend.events.RestartSatelliteAction@167a09dc class: com.redhat.rhn.frontend.events.RestartSatelliteEvent
2022-07-27 18:12:24,227 [Thread-41] DEBUG com.redhat.rhn.common.messaging.MessageQueue - registerAction(MessageAction, Class) - : com.redhat.rhn.frontend.events.SsmChangeBaseChannelSubscriptionsAction@14be1535 class: com.redhat.rhn.frontend.events.SsmChangeBaseChannelSubscriptionsEvent
2022-07-27 18:12:24,227 [Thread-41] DEBUG com.redhat.rhn.common.messaging.MessageQueue - registerAction(MessageAction, Class) - : com.redhat.rhn.frontend.events.SsmChangeChannelSubscriptionsAction@2ac9aa5c class: com.redhat.rhn.frontend.events.SsmChangeChannelSubscriptionsEvent
2022-07-27 18:12:24,227 [Thread-41] DEBUG com.redhat.rhn.common.messaging.MessageQueue - registerAction(MessageAction, Class) - : com.redhat.rhn.frontend.events.SsmDeleteServersAction@63288813 class: com.redhat.rhn.frontend.events.SsmDeleteServersEvent
2022-07-27 18:12:24,228 [Thread-41] DEBUG com.redhat.rhn.common.messaging.MessageQueue - registerAction(MessageAction, Class) - : com.redhat.rhn.frontend.events.SsmInstallPackagesAction@7b5f827c class: com.redhat.rhn.frontend.events.SsmInstallPackagesEvent
2022-07-27 18:12:24,229 [Thread-41] DEBUG com.redhat.rhn.common.messaging.MessageQueue - registerAction(MessageAction, Class) - : com.redhat.rhn.frontend.events.SsmRemovePackagesAction@514e049f class: com.redhat.rhn.frontend.events.SsmRemovePackagesEvent
2022-07-27 18:12:24,229 [Thread-41] DEBUG com.redhat.rhn.common.messaging.MessageQueue - registerAction(MessageAction, Class) - : com.redhat.rhn.frontend.events.SsmVerifyPackagesAction@2d011722 class: com.redhat.rhn.frontend.events.SsmVerifyPackagesEvent
2022-07-27 18:12:24,230 [Thread-41] DEBUG com.redhat.rhn.common.messaging.MessageQueue - registerAction(MessageAction, Class) - : com.redhat.rhn.frontend.events.SsmUpgradePackagesAction@10cad139 class: com.redhat.rhn.frontend.events.SsmUpgradePackagesEvent
2022-07-27 18:12:24,230 [Thread-41] DEBUG com.redhat.rhn.common.messaging.MessageQueue - registerAction(MessageAction, Class) - : com.redhat.rhn.frontend.events.SsmPowerManagementAction@b0fa3cb class: com.redhat.rhn.frontend.events.SsmPowerManagementEvent
2022-07-27 18:12:24,231 [Thread-41] DEBUG com.redhat.rhn.common.messaging.MessageQueue - registerAction(MessageAction, Class) - : com.redhat.rhn.frontend.events.CloneErrataAction@41ec90bb class: com.redhat.rhn.frontend.events.CloneErrataEvent
2022-07-27 18:12:24,231 [Thread-41] DEBUG com.redhat.rhn.common.messaging.MessageQueue - registerAction(MessageAction, Class) - : com.redhat.rhn.frontend.events.NewCloneErrataAction@2e1f45e5 class: com.redhat.rhn.frontend.events.NewCloneErrataEvent
2022-07-27 18:12:24,232 [Thread-41] DEBUG com.redhat.rhn.common.messaging.MessageQueue - registerAction(MessageAction, Class) - : com.redhat.rhn.frontend.events.SsmErrataAction@49e6d254 class: com.redhat.rhn.frontend.events.SsmErrataEvent
2022-07-27 18:12:24,233 [Thread-41] DEBUG com.redhat.rhn.common.messaging.MessageQueue - registerAction(MessageAction, Class) - : com.redhat.rhn.frontend.events.AlignSoftwareTargetAction@5722666 class: com.redhat.rhn.frontend.events.AlignSoftwareTargetMsg
2022-07-27 18:12:24,233 [Thread-41] DEBUG com.redhat.rhn.common.messaging.MessageQueue - registerAction(MessageAction, Class) - : com.redhat.rhn.frontend.events.ScheduleRepoSyncAction@46c7f070 class: com.redhat.rhn.frontend.events.ScheduleRepoSyncEvent
2022-07-27 18:12:24,233 [Thread-41] DEBUG com.redhat.rhn.common.messaging.MessageQueue - registerAction(MessageAction, Class) - : com.redhat.rhn.frontend.events.SsmSystemRebootAction@3841ff3f class: com.redhat.rhn.frontend.events.SsmSystemRebootEvent
2022-07-27 18:12:24,234 [Thread-41] DEBUG com.redhat.rhn.common.messaging.MessageQueue - registerAction(MessageAction, Class) - : com.redhat.rhn.frontend.events.SsmConfigFilesAction@93ef8a9 class: com.redhat.rhn.frontend.events.SsmConfigFilesEvent
2022-07-27 18:12:24,235 [Thread-41] DEBUG com.redhat.rhn.common.messaging.MessageQueue - registerAction(MessageAction, Class) - : com.suse.manager.reactor.messaging.ChannelsChangedEventMessageAction@25a143c3 class: com.suse.manager.reactor.messaging.ChannelsChangedEventMessage
2022-07-27 18:12:24,373 [Thread-41] FATAL com.redhat.rhn.taskomatic.core.TaskomaticDaemon - Failure occured during job recovery.

hbrown-uiowa avatar Jul 27 '22 23:07 hbrown-uiowa

We updated to 2022.06 and the issue persists. We can get taskomatic to generate this error:

Traceback (most recent call last):
	3: from ./uyuni_channel.rb:16:in `<main>'
	2: from ./uyuni_channel.rb:16:in `each'
	1: from ./uyuni_channel.rb:25:in `block in <main>'
/usr/lib64/ruby/gems/2.5.0/gems/xmlrpc-0.3.0/lib/xmlrpc/client.rb:272:in `call': redstone.xmlrpc.XmlRpcFault: Taskomatic is not available. (XMLRPC::FaultException)

by calling this api: sync_schedule = @client.call('channel.software.getRepoSyncCronExpression', @key, channel['label'])

hbrown-uiowa avatar Jul 28 '22 20:07 hbrown-uiowa

The last successful action before this started failing for us was:

Auto Patch Updates: 	2022-07-24 15:31:45 CDT 	FINISHED

as seen in https://uyuniserver/rhn/manager/admin/runtime-status

hbrown-uiowa avatar Aug 01 '22 18:08 hbrown-uiowa

@sysy69 can you please enable log level debug and provide the complete /var/log/rhn/rhn_taskomatic_daemon.log ? for enabling debug log uncomment line

log4j.logger.com.redhat.rhn=DEBUG

from /usr/share/rhn/classes/log4j.properties

@hbrown-uiowa if /usr/share/rhn/classes/log4j.properties does not exist, please create it, add log4j.logger.com.redhat.rhn=DEBUG and then rctaskomatic restart. You should also add the same line to /srv/tomcat/webapps/rhn/WEB-INF/classes/log4j.properties (create the file if it doesn't exist) then rctomcat restart . Then reproduce the issue and provide these two logfiles:

  • /var/log/rhn/rhn_web_ui.log
  • /var/log/rhn/rhn_taskomatic_daemon.log

Logs will be really verbose so after this test you may want to disable them, to save disk space (so remove /var/log/rhn/rhn_taskomatic_daemon.log from the two files and restart the processes)

mbussolotto avatar Aug 02 '22 08:08 mbussolotto

I enabled the debug settings from above and restarted but didn't seem to add additional logging for me

https://gist.github.com/megamaced/7eb5a7d4e6e825da5bebdee166d00243

megamaced avatar Aug 09 '22 10:08 megamaced

Is there any work around to this issue? We can't sync repodata

megamaced avatar Aug 09 '22 10:08 megamaced

still broken in 2022.08

megamaced avatar Aug 10 '22 12:08 megamaced

I suspect the issue is with quartz scheduler. Is there a way to clean out the jobs?

uyuni=# select * from qrtz_simple_triggers; trigger_name | trigger_group | repeat_count | repeat_interval | times_triggered | sched_name
-------------------------------------+---------------+--------------+-----------------+-----------------+------------------------ repo-sync-1-149-retry20220724154716 | 1 | 0 | 0 | 1 | DefaultQuartzScheduler repo-sync-1-148-retry20220724143715 | 1 | 0 | 0 | 0 | DefaultQuartzScheduler repo-sync-1-131-retry20220724153236 | 1 | 0 | 0 | 0 | DefaultQuartzScheduler repo-sync-1-128-retry20220724132525 | 1 | 0 | 0 | 0 | DefaultQuartzScheduler repo-sync-1-147-retry20220724133756 | 1 | 0 | 0 | 0 | DefaultQuartzScheduler repo-sync-1-151-retry20220724155226 | 1 | 0 | 0 | 0 | DefaultQuartzScheduler repo-sync-5-161-retry20220724155650 | 5 | 0 | 0 | 0 | DefaultQuartzScheduler (7 rows)

uyuni=# select trigger_name,trigger_group,job_name,trigger_state from qrtz_triggers WHERE trigger_name LIKE '%retry%'; trigger_name | trigger_group | job_name | trigger_state -------------------------------------+---------------+-----------------+--------------- repo-sync-1-149-retry20220724154716 | 1 | repo-sync-1-149 | COMPLETE repo-sync-1-149-retry20220724155311 | 1 | repo-sync-1-149 | WAITING repo-sync-1-131-retry20220724153236 | 1 | repo-sync-1-131 | WAITING repo-sync-1-128-retry20220724132525 | 1 | repo-sync-1-128 | WAITING repo-sync-1-147-retry20220724133756 | 1 | repo-sync-1-147 | WAITING repo-sync-1-151-retry20220724155226 | 1 | repo-sync-1-151 | WAITING repo-sync-5-161-retry20220724155650 | 5 | repo-sync-5-161 | WAITING repo-sync-1-148-retry20220724143715 | 1 | repo-sync-1-148 | ACQUIRED (8 rows)

ddholstad avatar Aug 29 '22 16:08 ddholstad

takomatic started after running: uyuni=# delete from qrtz_simple_triggers where trigger_name like '%retry%'; DELETE 7 uyuni=# delete from qrtz_triggers WHERE trigger_name LIKE '%retry%'; DELETE 8

We are waiting to see if it stops in the same part of the schedule when it gets back to it.

ddholstad99 avatar Aug 29 '22 19:08 ddholstad99

We had the same issue... Thanks @ddholstad99 for those two SQL statements. Taskomatic wont fail now after 10 seconds.

MaxHerrmannSVA avatar Sep 02 '22 13:09 MaxHerrmannSVA

same problem... thanks @ddholstad99 for the statements

sayingwhat avatar Sep 08 '22 10:09 sayingwhat

cc @mackdk

mbussolotto avatar Sep 08 '22 10:09 mbussolotto

If the problem recurs, can someone add the log info that was requested. Cleaning out the entries in the DB is just dealing with symptoms (and we're not even sure that it's the right thing to do). Since clearing out those rows in the DB, we haven't had it recur and our test system is syncing well on 2022.08.

hbrown-uiowa avatar Sep 08 '22 14:09 hbrown-uiowa

Unfortunately the statements @ddholstad99 did not work for me. Entries were removed, but no tasks running. Version 2022.08

darrynl avatar Oct 04 '22 19:10 darrynl

Running Uyuni 2022.08

I have tried the statements shared by @ddholstad99 , but I am still stuck with the same issue.

uyuni=# select * from qrtz_simple_triggers; trigger_name | trigger_group | repeat_count | repeat_interval | times_triggered | sched_name --------------+---------------+--------------+-----------------+-----------------+------------ (0 rows)

uyuni=# select trigger_name,trigger_group,job_name,trigger_state from qrtz_triggers WHERE trigger_name LIKE '%retry%'; trigger_name | trigger_group | job_name | trigger_state --------------+---------------+----------+--------------- (0 rows)

In Uyuni when clicking on Admin, Task Schedules, I get this: image

I see this in /var/log/rhn/rhn_taskomatic.daemon.log: image

Status of taskomatic service: image

Please let me know if any other logs/information would help.

darrynl avatar Oct 07 '22 20:10 darrynl

@darrynl please follow the instruction here: https://github.com/uyuni-project/uyuni/issues/5556#issuecomment-1167066907 . Moreover, which uyuni version are you using? Is it an upgrade?

mbussolotto avatar Nov 03 '22 14:11 mbussolotto

@darrynl @sysy69 please try to upgrade uyuni and let me know if the issue is still present. If still present, please follow https://github.com/uyuni-project/uyuni/issues/5556#issuecomment-1302198022 . Thanks!

mbussolotto avatar Nov 07 '22 09:11 mbussolotto

@darrynl @sysy69 any update ^^? Thanks :)

mbussolotto avatar Nov 30 '22 13:11 mbussolotto

unfortunately I wasn't able to reproduce the bug and the current logs are not enough to explain the reason of the issue. Since there's no answer for a long time, I suppose the issue has been fix, so I'll close the bug. Feel free to re-open it if you still see the error

mbussolotto avatar Dec 06 '22 08:12 mbussolotto