atalk-android icon indicating copy to clipboard operation
atalk-android copied to clipboard

OMEMO session for device id -1

Open licaon-kter opened this issue 2 years ago • 7 comments

aTalk 2.9.5/2.9.6 F-Droid / Converse https://github.com/conversejs/converse.js/commit/0df1c1880e0e3f48f70ee9f8edf7b97581ea18b5 Fennec

Steps:

  • create new user (via ejabberd server cli)
  • login in Converse.js
  • relogin several times until OMEMO works fine (ref: https://github.com/conversejs/converse.js/issues/2985)
  • start aTalk
  • login in aTalk with THE SAME user as Converse
  • atalk: send messages to contact, ok
  • messages appear as OMEMO errors in Converse, then disappear
  • aTalk Exit
  • Converse logout
  • Converse login
  • aTalk login
  • Converse send message to contact, error: https://gist.github.com/licaon-kter/a3742147a9508aa931e81d21468a63d6
  • aTalk send message to contact, error: atalkerror

Server side DB (notice first device id number?):

ejabberd=# SELECT * FROM pubsub_node WHERE host = '[email protected]' ORDER BY node;
       host       |                       node                       | parent | plugin | nodeid 
------------------+--------------------------------------------------+--------+--------+--------
 [email protected] | eu.siacs.conversations.axolotl.bundles:11095     |        | pep    |   1377
 [email protected] | eu.siacs.conversations.axolotl.bundles:3850      |        | pep    |   1378
 [email protected] | eu.siacs.conversations.axolotl.bundles:636623660 |        | pep    |   1379
 [email protected] | eu.siacs.conversations.axolotl.devicelist        |        | pep    |   1380
(4 rows)

ejabberd=# SELECT * FROM pubsub_item WHERE nodeid = '1380';
 nodeid |    itemid     |       publisher        |       creation       |     modification     |                                                     payload                                                      
--------+---------------+------------------------+----------------------+----------------------+------------------------------------------------------------------------------------------------------------------
   1380 | 67A4B19AB0F4D | [email protected]/user | 001658:307610:731732 | 001658:307610:731732 | <list xmlns='eu.siacs.conversations.axolotl'><device id='-1'/><device id='3850'/><device id='636623660'/></list>
(1 row)

Note that I saw this randomly testing aTalk (https://github.com/cmeng-git/atalk-android/issues/185#issuecomment-1188626882) and now I've repro this scenario on a new clean account.

PIng @jcbrand

/LE: Converse - user - OMEMO - remove other keys - fixes it

<device id='3850'/><device id='636623660'/></list>

...but on aTalk restart it's broken again

<device id='-1'/><device id='3850'/><device id='636623660'/></list>

...while the bundles list is unchanged.

licaon-kter avatar Jul 20 '22 09:07 licaon-kter

The error message indicates that there is an error in the OMEMO database record, where the fingerPrint table column contains no data i.e. null. Currently in OMEMO protocol implementations, there is NO defined standard method to deal with cleaning up the OMEMO PEP data that are resided on server database. Whenever the same user account is being deleted/created, the server will keep the old/obsoleted OMEMO pep records and create a new one.

All these OMEMO identities (new and obsoleted) will be send to the User whenever user log into the server. There is some mechanisms built into the smack OMEMO implementation to help remove the old OMEMO identities, but it is no fool proof. During aTalk development, it was found that the fingerPrint can accidentally be cleared to null. The OMEMO identities corruption happens on aTalk and other devices that support OMEMO e.g. conversations.

To resolve the data corruption, you can try to use the two aTalk built-in option i.e. a. "Purge unused identities" b. Regenerate OMEMO identities.

First try perform (a) to see it helps to clear the problem. If failed then perform (b).

omemo_option

Below are the all the discussions on OMEMO issues with smack development team during its implementation: https://discourse.igniterealtime.org/t/smack-omemo-4-4-0-encounters-null-object-reference-while-sending-an-omemo-message/89496/11 https://discourse.igniterealtime.org/t/smack-4-4-0-what-is-the-proper-way-for-app-to-handle-when-there-is-corrupted-omemokey-exception-in-user-own-omemo-device/89542/4 https://discourse.igniterealtime.org/t/smack-4-4-0-what-is-the-proper-way-for-app-to-handle-when-there-is-corrupted-omemokey-exception-in-user-own-omemo-device/89542 https://discourse.igniterealtime.org/t/smack-4-4-0-what-is-the-proper-way-for-app-to-handle-when-there-is-corrupted-omemokey-exception-in-user-own-omemo-device/89542 https://discourse.igniterealtime.org/t/smack-4-4-0-atalk-regenerate-omemo-identities-implementation-and-problem-faced/89565/8 https://discourse.igniterealtime.org/t/smack-4-4-0-atalk-regenerate-omemo-identities-implementation-and-problem-faced/89565 https://discourse.igniterealtime.org/t/smack-4-4-0-atalk-regenerate-omemo-identities-implementation-and-problem-faced/89565 https://discourse.igniterealtime.org/t/smack-4-4-0-smack-omemo-does-not-auto-repair-an-corrupted-identitykey-even-after-new-requested-bundle-received-from-server/89615 https://discourse.igniterealtime.org/t/smack-4-4-0-smack-omemo-does-not-auto-repair-an-corrupted-identitykey-even-after-new-requested-bundle-received-from-server/89615/5

cmeng-git avatar Jul 21 '22 00:07 cmeng-git

Whenever the same user account is being deleted/created, the server will keep the old/obsoleted OMEMO pep records and create a new one.

Not sure how other servers behave, but that's not true for ejabberd.

When I unregister the account, the nodes are deleted too. When I re-register (create basically) the same account there are no old nodes.

licaon-kter avatar Jul 21 '22 03:07 licaon-kter

To support OMEMO messaging, each newly registered android OMEMO device creates and saves its omemo data into ejabberd pubsub_node table on the server. The table contains the bundles and devicelist; the bundle row is identified with a bareJid and a deviceId. From the enclosed table below e.g. [email protected], actually has only 2 active devices and the rest are inactive (stray left behind).

image

Each record of the omemoDevice also contains a nodeId that links to its other info in table pusub_item e.g. prekeys records etc.

When [email protected] account on an android device is removed, if ejabberd proceeds to clean up the omemo records for this account; it needs to remove all the active/inactive devices and its prekey tables based on bareJid only. This causes problem for the second android device registered with the same [email protected], and the OMEMO messaging will fail. Therefore all xmpp servier will never attempt to perform this action on its own when an account is removed.

In aTalk implementation, it actually provides a routine to clean up the server omemo data for the affected omemoDevice. The function is activated when user perform:

  • Clean up omemo local database, and omemo bundle and devicelist on the server for the specified accountId:
    1. When a user account is deleted.
    1. During Regenerate OMEMO identities
    /**
     * Clean up omemo bundle and devicelist on the server for the specified omemoDevice:
     *
     * @param connection XMPPConnection
     * @param userJid Account userJid
     * @param omemoDevice of which the bundle and devicelist items are to be removed from server
     */
    private void purgeBundleDeviceList(XMPPConnection connection, BareJid userJid, OmemoDevice omemoDevice)
    {
        Timber.d("Purge server bundle and deviceList for old omemo device: %s", omemoDevice);
        PubSubManager pubsubManager = PubSubManager.getInstanceFor(connection, userJid);
        PepManager pepManager = PepManager.getInstanceFor(connection);

        // First refresh omemo devicelist on the server i.e. removed old id
        Set<Integer> deviceIds = Collections.emptySet();
        try {
            String nodeName = OmemoConstants.PEP_NODE_DEVICE_LIST;
            LeafNode leafNode = pubsubManager.getLeafNode(nodeName);
            if (leafNode != null) {
                List<PayloadItem<OmemoDeviceListElement>> items = leafNode.getItems();
                if (!items.isEmpty()) {
                    // These will completely remove the deviceList - mny not be good
                    // leafNode.deleteAllItems();
                    // pubsubManager.deleteNode(nodeName);

                    OmemoDeviceListElement publishedList = items.get(items.size() - 1).getPayload();
                    deviceIds = publishedList.copyDeviceIds();  // need a copy of the unmodifiable list
                    deviceIds.remove(omemoDevice.getDeviceId());
                }
            }
            OmemoDeviceListElement deviceList = new OmemoDeviceListElement_VAxolotl(deviceIds);
            pepManager.publish(nodeName, new PayloadItem<>(deviceList));

        } catch (SmackException | InterruptedException | XMPPException.XMPPErrorException e) {
            aTalkApp.showToastMessage(R.string.omemo_purge_inactive_device_error, omemoDevice);
        }

        // Only then purge omemo preKeys table/bundles on server
        try {
            String nodeName = omemoDevice.getBundleNodeName();
            LeafNode leafNode = pubsubManager.getLeafNode(nodeName);
            if (leafNode != null) {
                leafNode.deleteAllItems();
                pubsubManager.deleteNode(nodeName);
            }
        } catch (SmackException | InterruptedException | XMPPException.XMPPErrorException e) {
            aTalkApp.showToastMessage(R.string.omemo_purge_inactive_device_error, omemoDevice);
        }
    }

cmeng-git avatar Jul 22 '22 00:07 cmeng-git

Ok, not sure how that relates here, but my steps don't involve any "account removal" in the client. Clients were clean up via "clean app data" for Android and browser "delete site data".

I've mentioned that to make it clear that any nodes that exist are there from the used apps only. I've made a new account to specifically repro the issue.

Can you PM me via xmpp to give you a new account for you to test with?

licaon-kter avatar Jul 22 '22 03:07 licaon-kter

Just realised that all my comments after 22 July date, following your last comment, do not appear in this discussion thread. Do not understand why this happen.

In short, I need the actual debug log when the above problem occur to fully understand the cause of the problem. With which, I am unable to proceed further.

cmeng-git avatar Aug 27 '22 00:08 cmeng-git

Debug log of?

licaon-kter avatar Aug 27 '22 07:08 licaon-kter

I need the aTalk debug log that contains the process flow when the problem occur. Please refer to the following info to generate and send the aTalk debug log.

How can I give feedback to the developer for any aTalk problem encountered?

cmeng-git avatar Aug 28 '22 00:08 cmeng-git

Close due to no feedback for further actions.

cmeng-git avatar Nov 06 '22 00:11 cmeng-git