OutOfMemory with grouped YUM repos

Open alexander-vie opened this issue 1 year ago • 4 comments

We have been running Nexus OSS for quite some time, using the official images from Docker Hub. We are trying to upgrade from version 3.38.1 to the latest version (3.58.1-02) but are encountering serious problems: even with almost zero usage, Nexus hangs after a short time with 100% CPU and "java.lang.OutOfMemoryError: GC overhead limit exceeded" messages in the log.

We are using the original Nexus Docker Hub image with JVM defaults. We tried increasing the JVM memory, but this doesn't help. The affected Nexus instances act only as a proxy cache for a Docker repo and for several CentOS 7 mirrors.

The problem seems to be the grouping of the CentOS repos base+updates+extras+debuginfo into one group. Even grouping just base+updates doesn't seem to work.

We always see elastic/jvm statistics in the logs about the growing heap, usually along with one or two "Cooperative wait" exceptions:

2023-08-01 09:44:48,070+0000 INFO [elasticsearch[97173A06-2A9ED772-BD75FF73-431D2AA4-1B07B29F][scheduler][T#1]] *SYSTEM org.elasticsearch.monitor.jvm - [97173A06-2A9ED772-BD75FF73-431D2AA4-1B07B29F] [gc][old][2320][7] duration [7.3s], collections [1]/[8.6s], total [7.3s]/[14s], memory [1.4gb]->[1.5gb]/[2.3gb], all_pools {[young] [560.5mb]->[7.2mb]/[559.5mb]}{[survivor] [0b]->[0b]/[41.5mb]}{[old] [899.8mb]->[1.5gb]/[1.7gb]}
2023-08-01 09:44:56,562+0000 WARN [qtp837760789-468] *UNKNOWN org.sonatype.nexus.repository.httpbridge.internal.ViewServlet - Failure servicing: GET /repository/yum-centos7-group/repodata/repomd.xml
org.sonatype.nexus.common.io.CooperationException: Cooperative wait timed out on repodata/repomd.xml (2 threads cooperating)
    at org.sonatype.nexus.common.io.CooperatingFuture.waitForCall(CooperatingFuture.java:167)
    at org.sonatype.nexus.common.io.CooperatingFuture.cooperate(CooperatingFuture.java:81)
    at org.sonatype.nexus.common.io.ScopedCooperationFactorySupport$ScopedCooperation.cooperate(ScopedCooperationFactorySupport.java:106)
    at org.sonatype.nexus.repository.yum.orient.internal.group.OrientYumGroupFacetImpl.buildRepomd(OrientYumGroupFacetImpl.java:176)
    at org.sonatype.nexus.repository.yum.internal.group.YumGroupFacet$buildRepomd$0.call(Unknown Source)
    at org.sonatype.nexus.repository.yum.internal.group.YumAbstractGroupHandler.buildMergedRepomd(YumAbstractGroupHandler.groovy:131)
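
For anyone who wants to follow the heap growth across many of these GC lines, here is a minimal Python sketch. It assumes the "memory [before]->[after]/[max]" part always looks like the sample above and that the units are 1024-based. The function names (heap_usage, heap_fill_ratio) are made up for illustration and are not part of Nexus or Elasticsearch.

```python
import re

# One bracketed size such as "[1.4gb]" or "[0b]", as seen in the log sample.
_SIZE = r"\[([\d.]+)(gb|mb|kb|b)\]"

# The "memory [before]->[after]/[max]" part of an
# org.elasticsearch.monitor.jvm GC line (format assumed from the sample).
_MEMORY_RE = re.compile(r"memory " + _SIZE + r"->" + _SIZE + r"/" + _SIZE)

# Assumption: units are 1024-based.
_UNIT_BYTES = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3}


def _to_bytes(value, unit):
    """Convert a (number, unit) pair from the log into bytes."""
    return float(value) * _UNIT_BYTES[unit]


def heap_usage(line):
    """Return (before, after, max) heap sizes in bytes, or None if no match."""
    match = _MEMORY_RE.search(line)
    if match is None:
        return None
    groups = match.groups()
    return tuple(_to_bytes(groups[i], groups[i + 1]) for i in (0, 2, 4))


def heap_fill_ratio(line):
    """Return heap-after-GC divided by max heap, or None if no match."""
    usage = heap_usage(line)
    if usage is None:
        return None
    _, after, maximum = usage
    return after / maximum
```

For the GC line quoted above, the heap goes from 1.4gb to 1.5gb out of 2.3gb, so heap_fill_ratio returns roughly 0.65. The heap did not shrink after a 7.3s old-generation collection, which fits the "GC overhead limit exceeded" error.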

From that point on, it is only a matter of seconds until the whole heap is used up and Nexus fails to respond to any requests, whether via the API or the GUI.

Since we didn't face the problem with the older version (3.38), this could be due to a rather recent code change, as suggested here: https://community.sonatype.com/t/yum-times-out-when-pointing-to-a-group-repository/9973/5 The keyword seems to be YumGroupMergerImpl.

I am not sure whether this is YUM-specific or may affect other repo types too: https://community.sonatype.com/t/updating-to-nexus-3-23-0-causes-100-cpu-load-with-gc/4137/3

Maybe unrelated is this issue - but who knows: https://github.com/sonatype/nexus-public/issues/204

alexander-vie avatar Aug 01 '23 11:08 alexander-vie

I noticed that the problem is actually even worse - even a grouping of https://packages.microsoft.com/rhel/7/prod/ and https://packages.microsoft.com/yumrepos/azure-cli/ YUM repos doesn't work anymore with JVM defaults and will kill your nexus instance as soon as a client requests the yum indices, triggering the index merges.

That's bad: YUM grouping is completely broken now, or so it seems.

alexander-vie avatar Aug 01 '23 15:08 alexander-vie

Hi @alexander-vie , thanks a lot for taking the time to bring this to our awareness. We'll look into this further.

gracecllee avatar Aug 01 '23 20:08 gracecllee

We are affected by the same issue after upgrading Nexus. Is there any workaround besides not using grouped YUM repositories?

JSurf avatar Aug 24 '23 08:08 JSurf

> We are affected by the same issue after upgrading Nexus. Is there any workaround besides not using grouped YUM repositories?

I don't think so - at least, it seems that only the index merge of grouped repos is affected, not the index creation of hosted repos itself. Maybe grouping of YUM repos isn't widely used (apt repos can't be grouped at all), otherwise this issue would gain more visibility.

alexander-vie avatar Aug 25 '23 18:08 alexander-vie