pki icon indicating copy to clipboard operation
pki copied to clipboard

pkispawn fails against 389-ds 1.4.3.19

Open stanislavlevin opened this issue 4 years ago • 16 comments

As the part of IPA master installation, pkispawn fails with:

[2021-02-12 08:55:49] [ipatests.pytest_ipa.integration.host.Host.master1.cmd26] Failed to configure CA instance: CalledProcessError(Command ['/usr/sbin/pkispawn', '-s', 'CA', '-f', '/tmp/.private/root/tmp9esbgkf1', '--debug'] returned non-zero exit status 1
2021-02-12 08:55:46 [main] INFO: CMSEngine: initializing password store
2021-02-12 08:55:46 [main] INFO: CMSEngine: initializing password store for internaldb
2021-02-12 08:55:46 [main] INFO: CMSEngine: verifying connection to master1.ipa.test:389 as cn=Directory Manager
2021-02-12 08:55:47 [main] INFO: CMSEngine: initializing password store for replicationdb
2021-02-12 08:55:47 [main] INFO: CMSEngine: verifying connection to master1.ipa.test:389 as cn=Replication Manager masterAgreement1-master1.ipa.test-pki-tomcat,cn=config
2021-02-12 08:55:47 [main] INFO: CMSEngine: verifying connection to master1.ipa.test:389 as cn=Replication Manager masterAgreement1-master1.ipa.test-pki-tomcat,cn=config
2021-02-12 08:55:48 [main] INFO: CMSEngine: verifying connection to master1.ipa.test:389 as cn=Replication Manager masterAgreement1-master1.ipa.test-pki-tomcat,cn=config
2021-02-12 08:55:48 [main] SEVERE: CMSEngine: password test execution failed: 2
2021-02-12 08:55:48 [main] SEVERE: Unable to start CA engine: Password test execution failed. Is the database up?
Password test execution failed. Is the database up?
	at com.netscape.cmscore.apps.CMSEngine.initPasswordStore(CMSEngine.java:426)
	at com.netscape.cmscore.apps.CMSEngine.start(CMSEngine.java:997)
	at com.netscape.cmscore.apps.CMSEngine.contextInitialized(CMSEngine.java:1643)
	at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4678)
	at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5139)
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
	at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:717)
	at org.apache.catalina.core.ContainerBase.access$000(ContainerBase.java:129)
	at org.apache.catalina.core.ContainerBase$PrivilegedAddChild.run(ContainerBase.java:150)
	at org.apache.catalina.core.ContainerBase$PrivilegedAddChild.run(ContainerBase.java:140)
	at java.security.AccessController.doPrivileged(Native Method)
	at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:688)
	at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:705)
	at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:631)
	at org.apache.catalina.startup.HostConfig$DeployDescriptor.run(HostConfig.java:1830)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at org.apache.tomcat.util.threads.InlineExecutorService.execute(InlineExecutorService.java:75)
	at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
	at org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:526)
	at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:425)
	at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1576)
	at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:309)
	at org.apache.catalina.util.LifecycleBase.fireLifecycleEvent(LifecycleBase.java:123)
	at org.apache.catalina.util.LifecycleBase.setStateInternal(LifecycleBase.java:423)
	at org.apache.catalina.util.LifecycleBase.setState(LifecycleBase.java:366)
	at org.apache.catalina.core.ContainerBase.startInternal(ContainerBase.java:936)
	at org.apache.catalina.core.StandardHost.startInternal(StandardHost.java:841)
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
	at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1384)
	at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1374)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at org.apache.tomcat.util.threads.InlineExecutorService.execute(InlineExecutorService.java:75)
	at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:134)
	at org.apache.catalina.core.ContainerBase.startInternal(ContainerBase.java:909)
	at org.apache.catalina.core.StandardEngine.startInternal(StandardEngine.java:262)
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
	at org.apache.catalina.core.StandardService.startInternal(StandardService.java:421)
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
	at org.apache.catalina.core.StandardServer.startInternal(StandardServer.java:930)
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
	at org.apache.catalina.startup.Catalina.start(Catalina.java:738)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:342)
	at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:473)

2021-02-12 08:55:48 [main] INFO: Shutting down CA subsystem
2021-02-12 08:55:48 [main] INFO: RequestSubsystem: Request subsystem stopped
2021-02-12 08:55:48 [main] SEVERE: Exception sending context initialized event to listener instance of class [org.dogtagpki.server.ca.CAEngine]
java.lang.RuntimeException: Unable to start CA engine: Password test execution failed. Is the database up?
	at com.netscape.cmscore.apps.CMSEngine.contextInitialized(CMSEngine.java:1648)
	at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4678)
	at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5139)
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
	at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:717)
	at org.apache.catalina.core.ContainerBase.access$000(ContainerBase.java:129)
	at org.apache.catalina.core.ContainerBase$PrivilegedAddChild.run(ContainerBase.java:150)
	at org.apache.catalina.core.ContainerBase$PrivilegedAddChild.run(ContainerBase.java:140)
	at java.security.AccessController.doPrivileged(Native Method)
	at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:688)
	at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:705)
	at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:631)
	at org.apache.catalina.startup.HostConfig$DeployDescriptor.run(HostConfig.java:1830)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at org.apache.tomcat.util.threads.InlineExecutorService.execute(InlineExecutorService.java:75)
	at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
	at org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:526)
	at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:425)
	at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1576)
	at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:309)
	at org.apache.catalina.util.LifecycleBase.fireLifecycleEvent(LifecycleBase.java:123)
	at org.apache.catalina.util.LifecycleBase.setStateInternal(LifecycleBase.java:423)
	at org.apache.catalina.util.LifecycleBase.setState(LifecycleBase.java:366)
	at org.apache.catalina.core.ContainerBase.startInternal(ContainerBase.java:936)
	at org.apache.catalina.core.StandardHost.startInternal(StandardHost.java:841)
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
	at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1384)
	at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1374)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at org.apache.tomcat.util.threads.InlineExecutorService.execute(InlineExecutorService.java:75)
	at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:134)
	at org.apache.catalina.core.ContainerBase.startInternal(ContainerBase.java:909)
	at org.apache.catalina.core.StandardEngine.startInternal(StandardEngine.java:262)
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
	at org.apache.catalina.core.StandardService.startInternal(StandardService.java:421)
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
	at org.apache.catalina.core.StandardServer.startInternal(StandardServer.java:930)
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
	at org.apache.catalina.startup.Catalina.start(Catalina.java:738)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:342)
	at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:473)
Caused by: Password test execution failed. Is the database up?
	at com.netscape.cmscore.apps.CMSEngine.initPasswordStore(CMSEngine.java:426)
	at com.netscape.cmscore.apps.CMSEngine.start(CMSEngine.java:997)
	at com.netscape.cmscore.apps.CMSEngine.contextInitialized(CMSEngine.java:1643)
	... 45 more

2021-02-12 08:55:48 [main] INFO: Shutting down CA subsystem
2021-02-12 08:55:48 [main] INFO: RequestSubsystem: Request subsystem stopped

The failure is related to recent change in 389-ds https://github.com/389ds/389-ds-base/issues/4609 Previously, testLDAPConnection received LDAPException.NO_SUCH_OBJECT, but this was changed to LDAPException.INVALID_CREDENTIALS.

stanislavlevin avatar Feb 12 '21 10:02 stanislavlevin

this helps for 389-ds having old and new behavior:

diff --git a/pki-core/base/server/src/com/netscape/cmscore/apps/CMSEngine.java b/pki-core/base/server/src/com/netscape/cmscore/apps/CMSEngine.java
index e89a691d2..e652e1c3b 100644
--- a/pki-core/base/server/src/com/netscape/cmscore/apps/CMSEngine.java
+++ b/pki-core/base/server/src/com/netscape/cmscore/apps/CMSEngine.java
@@ -169,9 +169,8 @@ public class CMSEngine implements ServletContextListener {
 
     private static final int PW_OK =0;
     //private static final int PW_BAD_SETUP = 1;
-    private static final int PW_INVALID_PASSWORD = 2;
+    private static final int PW_INVALID_CREDENTIALS = 2;
     private static final int PW_CANNOT_CONNECT = 3;
-    private static final int PW_NO_USER = 4;
     private static final int PW_MAX_ATTEMPTS = 3;
 
 
@@ -402,16 +401,16 @@ public class CMSEngine implements ServletContextListener {
             }
 
             int iteration = 0;
-            int result = PW_INVALID_PASSWORD;
+            int result = PW_INVALID_CREDENTIALS;
 
             do {
                 String passwd = mPasswordStore.getPassword(tag, iteration);
                 result = testLDAPConnection(tag, connInfo, binddn, passwd);
                 iteration++;
-            } while ((result == PW_INVALID_PASSWORD) && (iteration < PW_MAX_ATTEMPTS));
+            } while ((result == PW_INVALID_CREDENTIALS) && (iteration < PW_MAX_ATTEMPTS));
 
             if (result != PW_OK) {
-                if ((result == PW_NO_USER) && (tag.equals("replicationdb"))) {
+                if ((result == PW_INVALID_CREDENTIALS) && (tag.equals("replicationdb"))) {
                     logger.warn(
                         "CMSEngine: password test execution failed for replicationdb " +
                         "with NO_SUCH_USER. This may not be a latest instance. Ignoring ..");
@@ -434,7 +433,7 @@ public class CMSEngine implements ServletContextListener {
         int ret = PW_OK;
 
         if (StringUtils.isEmpty(pwd)) {
-            return PW_INVALID_PASSWORD;
+            return PW_INVALID_CREDENTIALS;
         }
 
         String host = info.getHost();
@@ -455,12 +454,9 @@ public class CMSEngine implements ServletContextListener {
 
             switch (e.getLDAPResultCode()) {
             case LDAPException.NO_SUCH_OBJECT:
-                logger.debug("CMSEngine: user does not exist: " + binddn);
-                ret = PW_NO_USER;
-                break;
             case LDAPException.INVALID_CREDENTIALS:
-                logger.debug("CMSEngine: invalid password");
-                ret = PW_INVALID_PASSWORD;
+                logger.debug("CMSEngine: invalid credentials");
+                ret = PW_INVALID_CREDENTIALS;
                 break;
             default:
                 logger.debug("CMSEngine: unable to connect to " + name + ": " + e.getMessage());

stanislavlevin avatar Feb 12 '21 11:02 stanislavlevin

Actually, NO_SUCH_OBJECT is used in:

base/acme/src/main/java/org/dogtagpki/acme/database/LDAPDatabase.java:                e.getLDAPResultCode() != LDAPException.NO_SUCH_OBJECT
base/acme/src/main/java/org/dogtagpki/acme/database/LDAPDatabase.java:            if (e.getLDAPResultCode() == LDAPException.NO_SUCH_OBJECT) {
base/ca/src/com/netscape/cms/servlet/admin/PublisherAdminServlet.java:                    if (ex.getLDAPResultCode() == LDAPException.NO_SUCH_OBJECT) {
base/common/src/main/java/com/netscape/certsrv/ldap/LDAPExceptionConverter.java:        case LDAPException.NO_SUCH_OBJECT:
base/server/python/pki/server/deployment/pkiparser.py:        except ldap.NO_SUCH_OBJECT:
base/server/python/pki/server/pkispawn.py:        except ldap.NO_SUCH_OBJECT:
base/server/src/com/netscape/cms/authentication/DirBasedAuthentication.java:            case LDAPException.NO_SUCH_OBJECT:
base/server/src/com/netscape/cms/authentication/PortalEnroll.java:            case LDAPException.NO_SUCH_OBJECT:
base/server/src/com/netscape/cms/authentication/UidPwdDirAuthentication.java:            case LDAPException.NO_SUCH_OBJECT:
base/server/src/com/netscape/cms/authentication/UidPwdPinDirAuthentication.java:            case LDAPException.NO_SUCH_OBJECT:
base/server/src/com/netscape/cms/authentication/UserPwdDirAuthentication.java:            case LDAPException.NO_SUCH_OBJECT:
base/server/src/com/netscape/cms/authorization/DirAclAuthz.java:            if (engine.isPreOpMode() && e.getLDAPResultCode() == LDAPException.NO_SUCH_OBJECT) {
base/server/src/com/netscape/cms/publish/mappers/LdapCaSimpleMap.java:            } else if (e.getLDAPResultCode() == LDAPException.NO_SUCH_OBJECT && mCreateCAEntry) {
base/server/src/com/netscape/cms/publish/mappers/LdapEnhancedMap.java:                    LDAPException.NO_SUCH_OBJECT && mCreateEntry) {
base/server/src/com/netscape/cms/servlet/cert/RemoteAuthConfig.java:                 case LDAPException.NO_SUCH_OBJECT:
base/server/src/com/netscape/cms/servlet/cert/RemoteAuthConfig.java:                    case LDAPException.NO_SUCH_OBJECT:
base/server/src/com/netscape/cms/servlet/csadmin/LDAPConfigurator.java:            if (e.getLDAPResultCode() == LDAPException.NO_SUCH_OBJECT) {
base/server/src/com/netscape/cms/servlet/csadmin/LDAPConfigurator.java:            if (e.getLDAPResultCode() == LDAPException.NO_SUCH_OBJECT) {
base/server/src/com/netscape/cms/servlet/csadmin/LDAPConfigurator.java:                if (e.getLDAPResultCode() == LDAPException.NO_SUCH_OBJECT && ignoreErrors) {
base/server/src/com/netscape/cms/servlet/csadmin/SecurityDomainProcessor.java:            if (resultCode != LDAPException.NO_SUCH_OBJECT && resultCode != LDAPException.NO_SUCH_ATTRIBUTE) {
base/server/src/com/netscape/cms/servlet/csadmin/SecurityDomainProcessor.java:            if (resultCode != LDAPException.NO_SUCH_OBJECT) {
base/server/src/com/netscape/cmscore/base/LDAPConfigStore.java:                if (e.getLDAPResultCode() == LDAPException.NO_SUCH_OBJECT) {
base/server/src/com/netscape/cmscore/dbs/DBSSession.java:            if (e.getLDAPResultCode() == LDAPException.NO_SUCH_OBJECT) {
base/server/src/com/netscape/cmscore/dbs/DBSSession.java:            if (e.getLDAPResultCode() == LDAPException.NO_SUCH_OBJECT)
base/server/src/com/netscape/cmscore/dbs/DBSubsystem.java:            if (e.getLDAPResultCode() == LDAPException.NO_SUCH_OBJECT) {
base/server/src/com/netscape/cmscore/profile/LDAPProfileSubsystem.java:            if (e.getLDAPResultCode() == LDAPException.NO_SUCH_OBJECT) {
base/server/src/com/netscape/cmscore/session/LDAPSecurityDomainSessionTable.java:            if (e.getLDAPResultCode() == LDAPException.NO_SUCH_OBJECT) {
base/server/src/com/netscape/cmscore/session/LDAPSecurityDomainSessionTable.java:            case LDAPException.NO_SUCH_OBJECT:
base/server/src/com/netscape/cmscore/usrgrp/UGSubsystem.java:            if (e.getLDAPResultCode() == LDAPException.NO_SUCH_OBJECT) {
base/tps-client/src/authentication/LDAP_Authentication.cpp:            if (ldap_status != LDAP_NO_SUCH_OBJECT) {

some of them are related to authentication and some not.

stanislavlevin avatar Feb 15 '21 18:02 stanislavlevin

@edewata, any thoughts?

stanislavlevin avatar Feb 15 '21 18:02 stanislavlevin

@stanislavlevin Have the DS changes been publicly released, or is there a DS COPR build that we can test against?

edewata avatar Feb 15 '21 21:02 edewata

@stanislavlevin We have CI tests defined as GH actions: https://github.com/dogtagpki/pki/tree/master/.github/workflows

Currently it will install the official DS from Fedora, for example: https://github.com/dogtagpki/pki/blob/master/.github/workflows/ca-tests.yml#L94

If you have an unreleased DS COPR build and you want to test your PKI changes, you can enable the COPR repo in the test, push the changes to your GH repo, and wait for the tests to complete.

edewata avatar Feb 15 '21 21:02 edewata

According to the DS issue I provided above the released versions included this change:

  • 389-ds-base-2.0.3
  • 389-ds-base-1.4.4.13
  • 389-ds-base-1.4.3.19

If talk about Fedora, then

Release Stable version Version in testing
Fedora 35    
Fedora 34 389-ds-base-2.0.3-1.fc34  
Fedora 33 389-ds-base-1.4.4.12-1.fc33  
Fedora 32 389-ds-base-1.4.3.18-1.fc32

So, only F34 is affected right now.

Also this was caught in https://github.com/freeipa-pr-ci2/freeipa/pull/714, as @flo-renaud mentioned

stanislavlevin avatar Feb 16 '21 07:02 stanislavlevin

@edewata this is, unfortunately, already in nightly RHEL 8.4. We need this issue fixed as it blocks RHEL IdM too.

abbra avatar Feb 16 '21 09:02 abbra

@stanislavlevin IPA affected by CMSEngine.initPasswordStore() where replicationdb credentials token is initialized regardless whether it is needed or not. We don't use replicationdb in any IPA workflow so a potential fix for IPA would be to refactor initPasswordStore to not check replicationdb creds when they aren't needed.

abbra avatar Feb 16 '21 09:02 abbra

Also, this fails in Fedora 34 OpenQA tests: https://openqa.fedoraproject.org/tests/778182#dependencies Specific step screenshot: https://openqa.fedoraproject.org/tests/778182#step/role_deploy_domain_controller/33 Here is a video of the installation process: https://openqa.fedoraproject.org/tests/778182/video?filename=video.ogv, fails at around 40th second

abbra avatar Feb 16 '21 10:02 abbra

@abbra, my proposal is the general refactoring pattern which will work in either case.

I'm afraid that DS change has many hidden bugs in dependent projects. Just for example, ldapjdk examples reference NO_SUCH_OBJECT in case not existed user. https://github.com/dogtagpki/ldap-sdk/blob/b2043cacaf5e729258a7c4508549d0cfa176c439/java-sdk/ietfldap/org/ietf/ldap/LDAPConnection.java#L895

stanislavlevin avatar Feb 16 '21 10:02 stanislavlevin

I don't think all NO_SUCH_OBJECT usages are affected. This is a very specific case: you are doing an LDAP BIND and would no longer receive NO_SUCH_OBJECT.

AFAIK, it affects the following places:

  • base/ca/src/com/netscape/cms/servlet/admin/PublisherAdminServlet.java
  • base/server/src/com/netscape/cms/authentication/PortalEnroll.java
  • base/server/src/com/netscape/cms/authentication/UidPwdDirAuthentication.java
  • base/server/src/com/netscape/cms/authentication/UidPwdPinDirAuthentication.java
  • base/server/src/com/netscape/cms/authentication/UserPwdDirAuthentication.java
  • base/server/src/com/netscape/cms/servlet/csadmin/SecurityDomainProcessor.java
  • base/server/src/com/netscape/cmscore/apps/CMSEngine.java
  • base/tps-client/src/authentication/LDAP_Authentication.cpp (may be)

abbra avatar Feb 16 '21 10:02 abbra

Just FYI, @jmagne is working on this.

edewata avatar Feb 18 '21 00:02 edewata

Since PR https://github.com/dogtagpki/pki/pull/3465 is merged already and new PKI version is released, Fedora updates built and successful in OpenQA, we can close this issue.

Fedora updates:

  • F32: https://bodhi.fedoraproject.org/updates/FEDORA-2021-dc1a4934a5
  • F33: https://bodhi.fedoraproject.org/updates/FEDORA-2021-7458e2d835
  • F34: https://bodhi.fedoraproject.org/updates/FEDORA-2021-263244c071
  • F35: https://bodhi.fedoraproject.org/updates/FEDORA-2021-c95b836c2f

abbra avatar Feb 27 '21 09:02 abbra

Original issue has gone, but the potential ones we talked about are not fixed. @jmagne, thank you!

stanislavlevin avatar Mar 01 '21 15:03 stanislavlevin

@stanislavlevin - is this issue now resolved? There was lots of activity but nothing in 6 months?

ckelleyRH avatar Apr 13 '22 07:04 ckelleyRH

@ckelleyRH, the original issue was fixed, but some related issues (not exposed yet?) may be here (per https://github.com/dogtagpki/pki/issues/3458#issuecomment-779755566).

stanislavlevin avatar Apr 13 '22 08:04 stanislavlevin