openhab-core
openhab-core copied to clipboard
Boot failure after cache clearing related to marketplace bundles
System: OH in docker; \addons, \conf, and \userdata bound to external directories. 3.3 release has been stable for weeks.
OH hit a critical flaw when I attempted to upgrade to the 3.4M1 via a new container. The boot sequence froze nearly immediately up startup of the container with the following log:
2022-08-13 08:59:45.458 [INFO ] [org.openhab.ui.internal.UIService ] - Started UI on port 9080
2022-08-13 08:59:46.548 [INFO ] [community.CommunityKarafAddonHandler] - Reinstalling missing marketplace KAR: marketplace:137681
2022-08-13 08:59:46.602 [WARN ] [internal.service.FeaturesServiceImpl] - Can't load features repository mvn:org.openhab.core.features.karaf/org.openhab.core.features.karaf.openhab-core/3.4.0-SNAPSHOT/xml/features
java.lang.RuntimeException: Error resolving artifact org.openhab.core.features.karaf:org.openhab.core.features.karaf.openhab-core:xml:features:3.4.0-SNAPSHOT: [Could not find artifact org.openhab.core.features.karaf:org.openhab.core.features.karaf.openhab-core:xml:features:3.4.0-SNAPSHOT] : mvn:org.openhab.core.features.karaf/org.openhab.core.features.karaf.openhab-core/3.4.0-SNAPSHOT/xml/features
at org.apache.karaf.features.internal.service.RepositoryImpl.load(RepositoryImpl.java:121) ~[?:?]
at org.apache.karaf.features.internal.service.RepositoryImpl.<init>(RepositoryImpl.java:51) ~[?:?]
at org.apache.karaf.features.internal.service.RepositoryCacheImpl.create(RepositoryCacheImpl.java:51) ~[?:?]
at org.apache.karaf.features.internal.service.FeaturesServiceImpl.getFeatureCache(FeaturesServiceImpl.java:611) ~[?:?]
at org.apache.karaf.features.internal.service.FeaturesServiceImpl.ensureCacheLoaded(FeaturesServiceImpl.java:582) ~[?:?]
at org.apache.karaf.features.internal.service.FeaturesServiceImpl.getRepository(FeaturesServiceImpl.java:530) ~[?:?]
at org.apache.karaf.kar.internal.KarServiceImpl.findMissingDependency(KarServiceImpl.java:165) ~[?:?]
at org.apache.karaf.kar.internal.KarServiceImpl.install(KarServiceImpl.java:116) ~[?:?]
at org.apache.karaf.kar.internal.KarServiceImpl.install(KarServiceImpl.java:95) ~[?:?]
at org.openhab.core.addon.marketplace.karaf.internal.community.CommunityKarafAddonHandler.installFromCache(CommunityKarafAddonHandler.java:163) ~[?:?]
at org.openhab.core.addon.marketplace.karaf.internal.community.CommunityKarafAddonHandler.lambda$8(CommunityKarafAddonHandler.java:181) ~[?:?]
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) ~[?:?]
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177) ~[?:?]
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195) ~[?:?]
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177) ~[?:?]
at java.util.Iterator.forEachRemaining(Iterator.java:133) ~[?:?]
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) ~[?:?]
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[?:?]
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) ~[?:?]
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) ~[?:?]
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) ~[?:?]
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497) ~[?:?]
at org.openhab.core.addon.marketplace.karaf.internal.community.CommunityKarafAddonHandler.ensureCachedKarsAreInstalled(CommunityKarafAddonHandler.java:178) ~[?:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: java.io.IOException: Error resolving artifact org.openhab.core.features.karaf:org.openhab.core.features.karaf.openhab-core:xml:features:3.4.0-SNAPSHOT: [Could not find artifact org.openhab.core.features.karaf:org.openhab.core.features.karaf.openhab-core:xml:features:3.4.0-SNAPSHOT]
at org.ops4j.pax.url.mvn.internal.AetherBasedResolver.configureIOException(AetherBasedResolver.java:803) ~[?:?]
at org.ops4j.pax.url.mvn.internal.AetherBasedResolver.resolve(AetherBasedResolver.java:774) ~[?:?]
at org.ops4j.pax.url.mvn.internal.AetherBasedResolver.resolve(AetherBasedResolver.java:657) ~[?:?]
at org.ops4j.pax.url.mvn.internal.AetherBasedResolver.resolve(AetherBasedResolver.java:598) ~[?:?]
at org.ops4j.pax.url.mvn.internal.AetherBasedResolver.resolve(AetherBasedResolver.java:565) ~[?:?]
at org.ops4j.pax.url.mvn.internal.AetherBasedResolver.resolve(AetherBasedResolver.java:555) ~[?:?]
at org.ops4j.pax.url.mvn.internal.Connection.getInputStream(Connection.java:123) ~[?:?]
at java.net.URL.openStream(URL.java:1165) ~[?:?]
at org.apache.karaf.features.internal.service.RepositoryImpl.load(RepositoryImpl.java:114) ~[?:?]
... 29 more
Suppressed: shaded.org.eclipse.aether.transfer.ArtifactNotFoundException: Could not find artifact org.openhab.core.features.karaf:org.openhab.core.features.karaf.openhab-core:xml:features:3.4.0-SNAPSHOT
at shaded.org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolve(DefaultArtifactResolver.java:403) ~[?:?]
at shaded.org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolveArtifacts(DefaultArtifactResolver.java:215) ~[?:?]
at shaded.org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolveArtifact(DefaultArtifactResolver.java:192) ~[?:?]
at shaded.org.eclipse.aether.internal.impl.DefaultRepositorySystem.resolveArtifact(DefaultRepositorySystem.java:247) ~[?:?]
at org.ops4j.pax.url.mvn.internal.AetherBasedResolver.resolve(AetherBasedResolver.java:767) ~[?:?]
at org.ops4j.pax.url.mvn.internal.AetherBasedResolver.resolve(AetherBasedResolver.java:657) ~[?:?]
at org.ops4j.pax.url.mvn.internal.AetherBasedResolver.resolve(AetherBasedResolver.java:598) ~[?:?]
at org.ops4j.pax.url.mvn.internal.AetherBasedResolver.resolve(AetherBasedResolver.java:565) ~[?:?]
at org.ops4j.pax.url.mvn.internal.AetherBasedResolver.resolve(AetherBasedResolver.java:555) ~[?:?]
at org.ops4j.pax.url.mvn.internal.Connection.getInputStream(Connection.java:123) ~[?:?]
at java.net.URL.openStream(URL.java:1165) ~[?:?]
at org.apache.karaf.features.internal.service.RepositoryImpl.load(RepositoryImpl.java:114) ~[?:?]
at org.apache.karaf.features.internal.service.RepositoryImpl.<init>(RepositoryImpl.java:51) ~[?:?]
at org.apache.karaf.features.internal.service.RepositoryCacheImpl.create(RepositoryCacheImpl.java:51) ~[?:?]
at org.apache.karaf.features.internal.service.FeaturesServiceImpl.getFeatureCache(FeaturesServiceImpl.java:611) ~[?:?]
at org.apache.karaf.features.internal.service.FeaturesServiceImpl.ensureCacheLoaded(FeaturesServiceImpl.java:582) ~[?:?]
at org.apache.karaf.features.internal.service.FeaturesServiceImpl.getRepository(FeaturesServiceImpl.java:530) ~[?:?]
at org.apache.karaf.kar.internal.KarServiceImpl.findMissingDependency(KarServiceImpl.java:165) ~[?:?]
at org.apache.karaf.kar.internal.KarServiceImpl.install(KarServiceImpl.java:116) ~[?:?]
at org.apache.karaf.kar.internal.KarServiceImpl.install(KarServiceImpl.java:95) ~[?:?]
at org.openhab.core.addon.marketplace.karaf.internal.community.CommunityKarafAddonHandler.installFromCache(CommunityKarafAddonHandler.java:163) ~[?:?]
at org.openhab.core.addon.marketplace.karaf.internal.community.CommunityKarafAddonHandler.lambda$8(CommunityKarafAddonHandler.java:181) ~[?:?]
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) ~[?:?]
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177) ~[?:?]
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195) ~[?:?]
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177) ~[?:?]
at java.util.Iterator.forEachRemaining(Iterator.java:133) ~[?:?]
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) ~[?:?]
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[?:?]
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) ~[?:?]
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) ~[?:?]
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) ~[?:?]
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497) ~[?:?]
at org.openhab.core.addon.marketplace.karaf.internal.community.CommunityKarafAddonHandler.ensureCachedKarsAreInstalled(CommunityKarafAddonHandler.java:178) ~[?:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: shaded.org.eclipse.aether.resolution.ArtifactResolutionException: Error resolving artifact org.openhab.core.features.karaf:org.openhab.core.features.karaf.openhab-core:xml:features:3.4.0-SNAPSHOT
at shaded.org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolve(DefaultArtifactResolver.java:413) ~[?:?]
at shaded.org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolveArtifacts(DefaultArtifactResolver.java:215) ~[?:?]
at shaded.org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolveArtifact(DefaultArtifactResolver.java:192) ~[?:?]
at shaded.org.eclipse.aether.internal.impl.DefaultRepositorySystem.resolveArtifact(DefaultRepositorySystem.java:247) ~[?:?]
at org.ops4j.pax.url.mvn.internal.AetherBasedResolver.resolve(AetherBasedResolver.java:767) ~[?:?]
at org.ops4j.pax.url.mvn.internal.AetherBasedResolver.resolve(AetherBasedResolver.java:657) ~[?:?]
at org.ops4j.pax.url.mvn.internal.AetherBasedResolver.resolve(AetherBasedResolver.java:598) ~[?:?]
at org.ops4j.pax.url.mvn.internal.AetherBasedResolver.resolve(AetherBasedResolver.java:565) ~[?:?]
at org.ops4j.pax.url.mvn.internal.AetherBasedResolver.resolve(AetherBasedResolver.java:555) ~[?:?]
at org.ops4j.pax.url.mvn.internal.Connection.getInputStream(Connection.java:123) ~[?:?]
at java.net.URL.openStream(URL.java:1165) ~[?:?]
at org.apache.karaf.features.internal.service.RepositoryImpl.load(RepositoryImpl.java:114) ~[?:?]
... 29 more
2022-08-13 08:59:46.685 [INFO ] [rketplace.MarketplaceBundleInstaller] - Reinstalling missing marketplace bundle: marketplace:130551
2022-08-13 08:59:46.718 [WARN ] [rketplace.MarketplaceBundleInstaller] - The marketplace bundle was successfully installed but doesn't start: Could not resolve module: org.openhab.binding.samsungtv [233]
Unresolved requirement: Import-Package: org.openhab.core.config.discovery.upnp
2022-08-13 08:59:46.718 [INFO ] [rketplace.MarketplaceBundleInstaller] - Reinstalling missing marketplace bundle: marketplace:137040
2022-08-13 08:59:50.069 [INFO ] [.core.internal.i18n.I18nProviderImpl] - Time zone set to 'America/Phoenix'.
2022-08-13 08:59:50.073 [INFO ] [.core.internal.i18n.I18nProviderImpl] - Location set to '***********,**************'.
2022-08-13 08:59:50.074 [INFO ] [.core.internal.i18n.I18nProviderImpl] - Locale set to 'en_US'.
2022-08-13 08:59:50.074 [INFO ] [.core.internal.i18n.I18nProviderImpl] - Measurement system set to 'US'.
2022-08-13 08:59:50.483 [WARN ] [voice.mimic.internal.MimicTTSService] - Missing URL to access Mimic TTS API. Using localhost
2022-08-13 08:59:50.784 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'mapdb.persist'
2022-08-13 08:59:52.687 [INFO ] [.core.model.lsp.internal.ModelServer] - Started Language Server Protocol (LSP) service on port 5007
2022-08-13 08:59:53.211 [WARN ] [internal.handler.RachioDeviceHandler] - rachio:device:16034df2f2:B8D7AF288C48: ERROR: Initialisation failed
2022-08-13 08:59:53.991 [INFO ] [internal.SolarForecastHandlerFactory] - Persistence null cannot be queried. Feature Solcast Tuninng will not work
2022-08-13 08:59:57.398 [INFO ] [e.automation.internal.RuleEngineImpl] - Rule engine started.
2022-08-13 09:00:00.034 [ERROR] [ipt.internal.ScriptEngineManagerImpl] - ScriptEngine for language 'application/javascript;version=ECMAScript-2021' could not be found for identifier: e9058afc-f875-45aa-97ef-2b201d7bdf76
2022-08-13 09:00:00.090 [ERROR] [ipt.internal.ScriptEngineManagerImpl] - ScriptEngine for language 'application/javascript;version=ECMAScript-2021' could not be found for identifier: b042d898-50a3-44e6-b168-3933bb022921
2022-08-13 09:00:00.104 [ERROR] [ipt.internal.ScriptEngineManagerImpl] - ScriptEngine for language 'application/javascript;version=ECMAScript-2021' could not be found for identifier: dfbce083-2986-4d76-a4f6-3ffa6a0536f0
There is a reference in the forums to an identical issue that occurred after the user just cleared the cache (not a version update): https://community.openhab.org/t/oh3-2-no-start-after-clean-cache/138083
The boot process also corrupted my system: attempting to restart the previously stable 3.3 container resulted in the same error.
I resolved the issue by deleting the contents of the \kar and \bundles folders inside \userdata\marketplace. After that 3.3 container booted back into stable condition and following that 3.4M1 container also successfully booted up.
At the time of the error, I had only three marketplace bindings installed: Solar Forecast PV, Mimic TTS, and Samsung TV. The OP in the linked forum thread also seemed to have issue with the Solar Forecast binding as both that error and my error reference KAR: marketplace:137681, but I don't know if the specific binding is the issue or if it was just the first binding to be impacted by the issue in both cases. Even if the particular binding is the root cause, the cache clearing process should be able to survive a faulty binding as marketplace bindings become more common.
This issue has been mentioned on openHAB Community. There might be relevant details there:
https://community.openhab.org/t/oh3-2-no-start-after-clean-cache/138083/15
It looks like the feature in the KAR of the Solar Forecast PV Marketplace add-on always expects a 3.4.0-SNAPSHOT <repository> so it will not work with any other OH version:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<features xmlns="http://karaf.apache.org/xmlns/features/v1.6.0" name="org.openhab.binding.solarforecast-3.4.0-SNAPSHOT">
<repository>mvn:org.openhab.core.features.karaf/org.openhab.core.features.karaf.openhab-core/3.4.0-SNAPSHOT/xml/features</repository>
<feature name="openhab-binding-solarforecast" description="SolarForecast Binding" version="3.4.0.SNAPSHOT">
<feature>openhab-runtime-base</feature>
<bundle start-level="80">mvn:org.openhab.addons.bundles/org.openhab.binding.solarforecast/3.4.0-SNAPSHOT</bundle>
</feature>
<feature name="org.openhab.binding.solarforecast" description="openHAB Add-ons :: Bundles :: SolarForecast Binding" version="3.4.0.SNAPSHOT">
<details>This project contains the official add-ons of openHAB</details>
<feature prerequisite="true" dependency="false">wrap</feature>
<bundle start-level="80">mvn:org.json/json/20180813</bundle>
<bundle start-level="80">wrap:mvn:org.lastnpe.eea/eea-all/2.2.1</bundle>
</feature>
</features>
Perhaps @weymann can change the Marketplace topic so only the JAR is used (as suggested in the comment)?
I removed kar file from Marketplace but I cannot explain what is causing the problem. I'm building jar & kar against latest openhab-addons repo without referecing a certain snapshot. Both files are working in my OH3.3 test environment but for sure I didn't run an update process or cleaned cache so far which obviously causing some type of problem. But I don't kown what to fix in order to have a proper kar file created.
We probably still need a bit of tooling so it becomes easier to generate more usable KAR files.
IIRC it will work better if the embedded feature XML (/repository/org/openhab/addons/bundles/org.openhab.binding.solarforecast/3.4.0-SNAPSHOT/org.openhab.binding.solarforecast-3.4.0-SNAPSHOT-features.xml) would be:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<features xmlns="http://karaf.apache.org/xmlns/features/v1.6.0" name="org.openhab.binding.solarforecast-3.4.0-SNAPSHOT">
SNAPSHOT/xml/features</repository>
<feature name="openhab-binding-solarforecast" description="SolarForecast Binding" version="3.4.0.SNAPSHOT">
<feature>openhab-runtime-base</feature>
<bundle start-level="80">mvn:org.openhab.addons.bundles/org.openhab.binding.solarforecast/3.4.0-SNAPSHOT</bundle>
</feature>
</features>
...and it shouldn't add those "org.json:json" and "org.lastnpe.eea:eea-all" JARs to the KAR. I made some similar manual changes in the LOG4J2 Extra KAR.
This still doesn't address the issue that one malformed marketplace kar file causes OH to fail catastrophically. Is there some way to have the marketplace bindings fail gracefully so that under similar circumstances (which are surely going to crop up again with something open like the marketplace) OH continues the cache restore and booting process?
Do you by chance still have the .kar? In the marketplace it's replaced by a .jar and I would like to check if we can harden the marketplace code.
Just removed the link from the Marketplace posting. It's still available in my repo
I think what happens here is not a "catastrophic failure". An exception is logged (from, Karaf, we can't do anything against that) and then everything continues to work. There might be an additional issue with installing other add-ons, caused by #3025.
I think what happens here is not a "catastrophic failure". An exception is logged (from, Karaf, we can't do anything against that) and then everything continues to work
That's just it: everything did not continue to work; nothing continued to work. The log I posted above is not a snippet; that's the entire log up until I killed the container after it have been stuck like that for a long time. None of the failed boot attempts ever showed any log entries beyond that no matter how long I left them running. Nothing else gets loaded or registered, no rules run. Either all OH core functions failed in a cascade (less likely as other errors would surely have been logged) or the whole boot sequence was completely hung on one failed process. Even the typical shutdown logs that occur when stopping the container did not get posted.
The UI was started before the error, of course, so I could open the overview page, but it showed nothing as there are no things or items or rules. Further evidence that no activity was occurring is that all persisted items show a full gap from my first upgrade attempt until after I resolved the issue by removing the kar files (I have mapdb configured for item restore and use the default rrd4j for everything else). During the four or five boot attempts (some 3.3, some 3.4M1) not a single item state was persisted (even during the one I left running for a while).
As soon as the marketplace files were removed from the system, the next boot attempt was normal as were all subsequent attempts.