nexus-public
OutOfMemory with grouped YUM repos
We have been running Nexus OSS for quite some time, using the official images from Docker Hub. We tried to upgrade from version 3.38.1 to the latest version (3.58.1-02) but encountered serious problems: even with almost zero usage, after a short time Nexus hangs at 100% CPU with "java.lang.OutOfMemoryError: GC overhead limit exceeded" messages in the log.
We are using the original Nexus Docker Hub image with JVM defaults and also tried to increase the JVM memory, but this doesn't help. The affected Nexus instances act only as proxy caches for a Docker repo and for several CentOS 7 mirrors.
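For reference, increasing the memory on the official image is normally done via the INSTALL4J_ADD_VM_PARAMS environment variable; a minimal sketch of what we mean by "increase JVM Memory" (the sizes here are placeholders, not our exact production settings):

    # sketch: override the default heap / direct memory of the official image
    docker run -d -p 8081:8081 --name nexus \
      -e INSTALL4J_ADD_VM_PARAMS="-Xms4g -Xmx4g -XX:MaxDirectMemorySize=4g" \
      sonatype/nexus3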
The problem seems to be the grouping of the CentOS repos base+updates+extras+debuginfo into one group. Even grouping only base+updates doesn't seem to work.
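The group setup itself is nothing special; roughly like this, sketched via the repositories REST API (member names and blob store are examples, and the exact payload may differ slightly by version):

    # sketch: create a YUM group over the individual CentOS 7 proxy repos
    curl -u admin:***** -X POST "https://nexus.example.com/service/rest/v1/repositories/yum/group" \
      -H "Content-Type: application/json" \
      -d '{
            "name": "yum-centos7-group",
            "online": true,
            "storage": { "blobStoreName": "default", "strictContentTypeValidation": true },
            "group": { "memberNames": ["yum-centos7-base", "yum-centos7-updates", "yum-centos7-extras", "yum-centos7-debuginfo"] }
          }'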
We always see elasticsearch/jvm statistics in the logs about the growing heap, and usually one or two "Cooperative wait" exceptions:
2023-08-01 09:44:48,070+0000 INFO [elasticsearch[97173A06-2A9ED772-BD75FF73-431D2AA4-1B07B29F][scheduler][T#1]] *SYSTEM org.elasticsearch.monitor.jvm - [97173A06-2A9ED772-BD75FF73-431D2AA4-1B07B29F] [gc][old][2320][7] duration [7.3s], collections [1]/[8.6s], total [7.3s]/[14s], memory [1.4gb]->[1.5gb]/[2.3gb], all_pools {[young] [560.5mb]->[7.2mb]/[559.5mb]}{[survivor] [0b]->[0b]/[41.5mb]}{[old] [899.8mb]->[1.5gb]/[1.7gb]}
2023-08-01 09:44:56,562+0000 WARN [qtp837760789-468] *UNKNOWN org.sonatype.nexus.repository.httpbridge.internal.ViewServlet - Failure servicing: GET /repository/yum-centos7-group/repodata/repomd.xml
org.sonatype.nexus.common.io.CooperationException: Cooperative wait timed out on repodata/repomd.xml (2 threads cooperating)
    at org.sonatype.nexus.common.io.CooperatingFuture.waitForCall(CooperatingFuture.java:167)
    at org.sonatype.nexus.common.io.CooperatingFuture.cooperate(CooperatingFuture.java:81)
    at org.sonatype.nexus.common.io.ScopedCooperationFactorySupport$ScopedCooperation.cooperate(ScopedCooperationFactorySupport.java:106)
    at org.sonatype.nexus.repository.yum.orient.internal.group.OrientYumGroupFacetImpl.buildRepomd(OrientYumGroupFacetImpl.java:176)
    at org.sonatype.nexus.repository.yum.internal.group.YumGroupFacet$buildRepomd$0.call(Unknown Source)
    at org.sonatype.nexus.repository.yum.internal.group.YumAbstractGroupHandler.buildMergedRepomd(YumAbstractGroupHandler.groovy:131)
From that point on it is only a matter of seconds before the whole heap is used up and Nexus fails to respond to any requests, be it API or GUI.
Since we didn't face the problem with the older version (3.38), this could be due to a rather recent code change, as suggested here: https://community.sonatype.com/t/yum-times-out-when-pointing-to-a-group-repository/9973/5 - the keyword seems to be YumGroupMergerImpl.
I am not sure if this is YUM specific or may affect other repo types too: https://community.sonatype.com/t/updating-to-nexus-3-23-0-causes-100-cpu-load-with-gc/4137/3
This issue may be unrelated - but who knows: https://github.com/sonatype/nexus-public/issues/204
I noticed that the problem is actually even worse: even a group of just the https://packages.microsoft.com/rhel/7/prod/ and https://packages.microsoft.com/yumrepos/azure-cli/ YUM repos no longer works with the JVM defaults and will kill your Nexus instance as soon as a client requests the YUM indices, triggering the index merges.
That's bad - it seems YUM grouping is completely broken now.
Hi @alexander-vie, thanks a lot for taking the time to bring this to our attention. We'll look into this further.
We are affected by the same issue after upgrading Nexus. Is there any workaround besides not using grouped YUM repositories?
I don't think so - at least, it seems that only the index merge of grouped repos is affected, not the index creation of hosted repos itself. Maybe grouping of YUM repos isn't widely used (APT repos can't be grouped at all), otherwise this issue would gain more visibility.
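Not using the group in practice means pointing the clients at the individual proxy repositories instead of the single group URL, e.g. with a client-side .repo file roughly like this (hostname and repository names are examples, not our actual setup):

    # /etc/yum.repos.d/nexus-centos7.repo - one section per proxy repo instead of the group
    [nexus-centos7-base]
    name=CentOS 7 Base via Nexus proxy
    baseurl=https://nexus.example.com/repository/yum-centos7-base/
    enabled=1
    gpgcheck=1

    [nexus-centos7-updates]
    name=CentOS 7 Updates via Nexus proxy
    baseurl=https://nexus.example.com/repository/yum-centos7-updates/
    enabled=1
    gpgcheck=1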