
Performance degradation on networks with a large number of nodes

Open jleach opened this issue 1 year ago • 18 comments

TL;DR

Although it seems that ledger network performance doesn't directly impact a wallet's ability to complete a transaction (such as accepting a credential), the number of network connections does matter. Wallets with fewer connections to the ledger network tend to complete transactions noticeably faster. This highlights the importance of optimizing the number of network connections for efficiency.

Problem

When using Aries Bifold, BC Wallet, or AFJ to "Accept" a credential, the process can become frustratingly slow on ledgers with many nodes.

Analysis

The table below provides information about test results. These tests were conducted on the sovrin:staging environment using the LSBC Test credential.

During testing, the same test was run three times on each platform. The table shows only the two best results per platform, with the exception of Orbi Edge.

For Orbi Edge, the first test result is shown, even though it was slower. This initial test showed a notably higher number of network connections. Subsequent tests for Orbi Edge were more optimized.

In the table, you'll find two key columns:

  • "Network": This column indicates the number of network connections that were established during the test.
  • "Duration": This column provides information about the time it took for the wallet to register either "success" or "failure" when accepting the credential.
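
| No. | Platform      | Network | Duration | Comment     |
|-----|---------------|---------|----------|-------------|
| 1   | Bifold iOS    | 28      | 28 sec   | AFJ/IndyVDR |
| 2   | Bifold iOS    | 36      | 23 sec   | AFJ/IndyVDR |
| 3   | Trinsic iOS   | 206     | 60 sec   | Fail        |
| 4   | Trinsic iOS   | 209     | 60 sec   | Fail        |
| 5   | Lissi iOS     | 230     | 60 sec   | Fail        |
| 6   | Lissi iOS     | 242     | 60 sec   | Fail        |
| 7   | Node.js Linux |         |          | AFJ/IndyVDR |
| 8   | Node.js Linux |         |          | AFJ/IndyVDR |
| 9   | Orbi Edge iOS | 7       | 5 sec    |             |
| 10  | Orbi Edge iOS | 2       | 3 sec    |             |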

During testing, both Trinsic and Lissi encountered issues when attempting to accept the credential. After waiting for ~60 seconds, an error message appeared.

It was noted via logging that when accepting a credential, AFJ makes a series of network calls, which are logged as follows:

  1. Get credential definition
  2. Get transaction
  3. Get credential definition
  4. Get transaction
  5. Get schema
  6. Get revocation registry definition

To accurately assess network performance and duration, these calls were replicated using the IndyVDR Proxy and cURL. The results of these tests are documented in the table below.

| No. | Platform      | Network | Duration | Comment |
|-----|---------------|---------|----------|---------|
| 1   | IndyVDR Linux | 8       | 4 sec    |         |
| 2   | IndyVDR Linux | 10      | 4 sec    |         |
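
The replication described above, issuing the same six reads in order and timing them, could be sketched roughly as follows. This is a minimal illustration and not the script used for the test: `Step`, `timeSteps`, and `totalMs` are hypothetical names, and each `run` is a stub that would have to be replaced with a real request (for example, an HTTP call to an IndyVDR Proxy instance).

```typescript
// A named ledger read. `run` is a placeholder for the real request.
interface Step {
  name: string;
  run: () => Promise<unknown>;
}

interface Timing {
  name: string;
  ms: number;
}

// Runs the steps one after another and records how long each one took.
// The clock is injectable so the function can be exercised deterministically.
async function timeSteps(
  steps: Step[],
  now: () => number = Date.now
): Promise<Timing[]> {
  const timings: Timing[] = [];
  for (const step of steps) {
    const start = now();
    await step.run();
    timings.push({ name: step.name, ms: now() - start });
  }
  return timings;
}

// Sums the per-step durations to get the cumulative cost of the sequence.
function totalMs(timings: Timing[]): number {
  return timings.reduce((sum, t) => sum + t.ms, 0);
}

// The call sequence as logged when AFJ accepts a credential.
const loggedCalls = [
  "Get credential definition",
  "Get transaction",
  "Get credential definition",
  "Get transaction",
  "Get schema",
  "Get revocation registry definition",
];

// Stub steps: each `run` resolves immediately. Replace with real requests.
const stubSteps: Step[] = loggedCalls.map((name) => ({
  name,
  run: async () => undefined,
}));

timeSteps(stubSteps).then((timings) => {
  for (const t of timings) console.log(`${t.name}: ${t.ms} ms`);
  console.log(`total: ${totalMs(timings)} ms`);
});
```

Timing each step separately makes it possible to compare the per-call cost with the cumulative duration of the whole sequence.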

It is not known which frameworks Orbi Edge, Lissi, or Trinsic use.

In the case of Lissi and Trinsic, our observations indicate that they scan all configured ledgers while in the process of accepting a credential. This thorough scanning approach likely played a role in reaching a timeout at approximately 60 seconds, resulting in a test failure.

NOTE Preliminary testing with Trinsic showed results similar to Orbi Edge. After a re-install, however, this was no longer the case, and the results above were collected.

Conclusion

This slowness seems to stem from AFJ or IndyVDR making numerous network calls. While each call is quick on its own, the cumulative effect can lead to significant delays when considering response processing.

Our test results reveal some key insights:

  1. Ledger Network Performance: Ledger network performance doesn't seem to significantly impact the results, as evidenced by the efficient performance of IndyVDR Linux and Orbi Edge. Orbi Edge completes transactions in as little as 3 seconds. Even when we eliminate duplicate network calls from IndyVDR Linux, it achieves similar results.

  2. Number of Network Connections: On the other hand, the number of network connections appears to be a notable factor in test results. Tests that establish fewer connections, possibly the minimum required, tend to complete significantly faster compared to those that create multiple network connections.

In light of these findings, we recommend that AFJ and IndyVDR consider optimizing their ledger network interactions by:

  1. Removing Duplicate Network Calls: Identify and eliminate any duplicate network calls that are part of the same transaction to reduce redundancy.

  2. Batching Queries: Implement a strategy to batch queries, sending them to the same two nodes over the same connection, rather than establishing multiple network connections for each query.

  3. Network Reconciliation: Consider periodic network reconciliation or scheduling intervals for this process, separating it from critical transactions such as accepting a credential or processing proof requests.

  4. Caching Transactions: Explore the possibility of caching ledger transactions, given their immutability, to improve efficiency.
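
Recommendations 1 and 4 above can be illustrated with a small in-memory cache. This is a minimal sketch, not AFJ or IndyVDR code: `LedgerReadCache`, the key format, and `fetchCredDef` are hypothetical names, and the stubbed fetcher stands in for a real ledger read. It assumes the cached reads return immutable data.

```typescript
// Hypothetical helper, not part of AFJ or IndyVDR.
type Fetcher<T> = () => Promise<T>;

class LedgerReadCache {
  // Resolved or in-flight reads, keyed by a caller-chosen string such as
  // "credDef:<id>" or "txn:<seqNo>" (the key format is an assumption).
  private readonly entries = new Map<string, Promise<unknown>>();

  // Returns the cached or in-flight promise for `key`, or starts a new read.
  get<T>(key: string, fetcher: Fetcher<T>): Promise<T> {
    const existing = this.entries.get(key);
    if (existing !== undefined) {
      return existing as Promise<T>;
    }
    const pending = fetcher();
    // Drop failed reads so a later call can retry instead of reusing the error.
    pending.catch(() => {
      if (this.entries.get(key) === pending) {
        this.entries.delete(key);
      }
    });
    this.entries.set(key, pending);
    return pending;
  }
}

// Usage with a stubbed fetcher that counts how often the "network" is hit.
let networkCalls = 0;
const fetchCredDef = (): Promise<string> => {
  networkCalls += 1;
  return Promise.resolve("cred-def-body");
};

const cache = new LedgerReadCache();
const firstRead = cache.get("credDef:example", fetchCredDef);
const secondRead = cache.get("credDef:example", fetchCredDef);
// Both calls share one promise, so the fetcher runs only once.
```

Storing the promise rather than the resolved value means a second request for the same key that arrives while the first is still in flight reuses it instead of opening another connection. Reads whose results can change over time would need an expiry policy or should bypass a cache like this.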

These optimizations could enhance the overall performance of ledger network interactions. Thank you for your attention to these recommendations, which aim to streamline the process for a smoother user experience.

How To Reproduce

Use a demo on Sovrin Test/Staging. If you don't have one, use this email verification service. Watch your firewall logs and time the result. If you have a pfSense-based router with pfTop, you can run this filter: tcp dst port 9700||9702||9744||9777||9778||9799 and out.

jleach, Oct 23 '23 16:10