
Netopeer2-server startup race condition

Open mkuklews1 opened this issue 5 months ago • 4 comments

Hi,

I'm using Netopeer2 version 2.4.1. I noticed an issue while running some Python tests in which I restart Netopeer2-server frequently and in quick succession.

The problem is that SSH key authentication fails if the client initiates Call Home before Netopeer2-server is started. If I make sure the server is started before the client starts Call Home, key authentication succeeds. Here is the log from the failing case:

daemon.info netopeer2-server[7688]: Connection 1352 created.
daemon.info netopeer2-server[7688]: Listening on /var/run/netopeer2-server.sock for UNIX connections.
daemon.info netopeer2-server[7688]: Triggering "ietf-netconf-server" "done" event on enabled data.
daemon.info netopeer2-server[7688]: Listening on 0.0.0.0:830 for SSH connections.
daemon.info netopeer2-server[7688]: Call Home client "default-client" endpoint "default-ssh" connecting...
daemon.info netopeer2-server[7688]: Trying to connect via IPv4 to [remote_ip]:4334.
daemon.info netopeer2-server[7688]: Successfully connected to [remote_ip]:4334 over IPv4.
daemon.err netopeer2-server[7688]: Keystore entry "genkey" not found.
daemon.info netopeer2-server[7688]: Triggering "ietf-keystore" "done" event on enabled data.
daemon.info netopeer2-server[7688]: Triggering "ietf-truststore" "done" event on enabled data.
daemon.info netopeer2-server[7688]: Triggering "libnetconf2-netconf-server" "done" event on enabled data.
daemon.info netopeer2-server[7688]: Triggering "ietf-netconf-acm" "done" event on enabled data.
daemon.info netopeer2-server[7688]: Triggering "ietf-netconf-acm" "done" event on enabled data.
daemon.info netopeer2-server[7688]: Triggering "ietf-netconf-acm" "done" event on enabled data.
daemon.info netopeer2-server[7688]: Triggering "ietf-netconf-acm" "done" event on enabled data.
daemon.info netopeer2-server[7688]: Server terminated.
daemon.info netopeer2-server[7688]: Call Home thread signaled to exit, client "default-client" probably removed.
daemon.info netopeer2-server[7688]: Call Home client "default-client" thread exit.
daemon.info netopeer2-server[7688]: Connection 1352 destroyed.

A minimal test case to reproduce the issue is as follows:

  1. Start Netopeer2-cli, configure it to use SSH key authentication, and execute the listen command.
  2. Start Netopeer2-server.
  3. The server log should contain the Keystore entry "genkey" not found error shown above.

From my investigation, it looks like the connection is established and the server tries to get "genkey" from the keystore before the keystore has been populated with data from ietf-keystore. The server starts handling sockets when the log shows Triggering "ietf-netconf-server" "done" event on enabled data, whereas the keystore is populated only later, when the log shows Triggering "ietf-keystore" "done" event on enabled data.

I also ran a small test in which I changed this part of main.c:

    /*
     * ietf-netconf-server, ietf-keystore, ietf-trustore, and libnetconf2-netconf-server handled by ln2
     */
    SR_CONFIG_SUBSCR("ietf-netconf-server", NULL, np2srv_libnetconf2_config_cb);
    SR_CONFIG_SUBSCR("ietf-keystore", NULL, np2srv_libnetconf2_config_cb);
    SR_CONFIG_SUBSCR("ietf-truststore", NULL, np2srv_libnetconf2_config_cb);
    SR_CONFIG_SUBSCR("libnetconf2-netconf-server", NULL, np2srv_libnetconf2_config_cb);

to this:

    /*
     * ietf-netconf-server, ietf-keystore, ietf-trustore, and libnetconf2-netconf-server handled by ln2
     */
    SR_CONFIG_SUBSCR("ietf-keystore", NULL, np2srv_libnetconf2_config_cb);
    SR_CONFIG_SUBSCR("ietf-truststore", NULL, np2srv_libnetconf2_config_cb);
    SR_CONFIG_SUBSCR("ietf-netconf-server", NULL, np2srv_libnetconf2_config_cb);
    SR_CONFIG_SUBSCR("libnetconf2-netconf-server", NULL, np2srv_libnetconf2_config_cb);

which fixes the issue.

For reference, my configuration looks like this:

<netconf-server xmlns="urn:ietf:params:xml:ns:yang:ietf-netconf-server">
    <listen>
        <endpoints>
            <endpoint>
                <name>default-ssh</name>
                <ssh>
                    <tcp-server-parameters>
                        <local-address>0.0.0.0</local-address>
                        <keepalives>
                            <idle-time>1</idle-time>
                            <max-probes>10</max-probes>
                            <probe-interval>5</probe-interval>
                        </keepalives>
                    </tcp-server-parameters>
                    <ssh-server-parameters>
                        <server-identity>
                            <host-key>
                                <name>default-key</name>
                                <public-key>
                                    <central-keystore-reference>genkey</central-keystore-reference>
                                </public-key>
                            </host-key>
                        </server-identity>
                        <client-authentication/>
                    </ssh-server-parameters>
                </ssh>
            </endpoint>
        </endpoints>
    </listen>
    <call-home>
        <netconf-client>
            <name>default-client</name>
            <endpoints>
                <endpoint>
                    <name>default-ssh</name>
                    <ssh>
                        <tcp-client-parameters>
                            <remote-address>[remote_ip]</remote-address>
                            <keepalives>
                                <idle-time>1</idle-time>
                                <max-probes>10</max-probes>
                                <probe-interval>5</probe-interval>
                            </keepalives>
                        </tcp-client-parameters>
                        <ssh-server-parameters>
                            <server-identity>
                                <host-key>
                                    <name>default-key</name>
                                    <public-key>
                                        <central-keystore-reference>genkey</central-keystore-reference>
                                    </public-key>
                                </host-key>
                            </server-identity>
                            <client-authentication>
                                <endpoint-reference xmlns="urn:cesnet:libnetconf2-netconf-server">default-ssh</endpoint-reference>
                            </client-authentication>
                        </ssh-server-parameters>
                    </ssh>
                </endpoint>
            </endpoints>
            <connection-type>
                <persistent/>
            </connection-type>
        </netconf-client>
    </call-home>
</netconf-server>

mkuklews1, Jun 05 '25 10:06