
Quadratic runtime when loading signed CorDapps

Open bjornbugge opened this issue 4 years ago • 3 comments

Background

Running a flow for the first time in a signed CorDapp takes unacceptably long. In our CorDapp, the time to run the very first flow goes up from ~5-8s to ~60s when the CorDapp is signed. The CorDapp has 5683 class files and is roughly 7.8 MB in size.

Relevant versions

  • Corda: 4.8 (both opensource and EE)
  • Quasar: 0.7.13_r3
  • JDK 8 u202

Diagnosis

What makes things slow?

In commit https://github.com/corda/corda/commit/1a7401472fc465a196dd05f6deecc98a68cdbb96 the global flag defaultUseCaches was set to false in order to fix a file handle leak (#6869). When a class file is loaded from a JAR file, a new “file connection” to the JAR file is created using a factory singleton. If caching is disabled, this always creates and returns a new instance of java.util.jar.JarFile. This change is documented in CORDA-4120. The file handle leak comes from a bug in the JDK: https://bugs.openjdk.java.net/browse/JDK-8156014.
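The effect of the caching flag on JarFile instances can be observed with a small sketch. It sets the flag per connection with setUseCaches rather than touching the global defaultUseCaches, and it uses a throwaway unsigned JAR. The class name JarCacheDemo, the helper names, and the entry hello.txt are made up for illustration; none of this is Corda or Quasar code.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.JarURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

// Hypothetical demo class, not part of Corda or Quasar.
final class JarCacheDemo {

    // Builds a tiny unsigned JAR with a single entry so the sketch is self-contained.
    static Path createSampleJar() {
        try {
            Path jar = Files.createTempFile("cache-demo", ".jar");
            try (OutputStream out = Files.newOutputStream(jar);
                 JarOutputStream jos = new JarOutputStream(out)) {
                jos.putNextEntry(new JarEntry("hello.txt"));
                jos.write("hello".getBytes(StandardCharsets.UTF_8));
                jos.closeEntry();
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Opens two jar: URL connections to the same entry and reports
    // whether both are backed by the same JarFile instance.
    static boolean sharesJarFile(Path jar, boolean useCaches) {
        try {
            URL url = new URL("jar:" + jar.toUri() + "!/hello.txt");
            JarURLConnection first = (JarURLConnection) url.openConnection();
            JarURLConnection second = (JarURLConnection) url.openConnection();
            first.setUseCaches(useCaches);
            second.setUseCaches(useCaches);
            JarFile a = first.getJarFile();
            JarFile b = second.getJarFile();
            boolean same = (a == b);
            a.close();
            if (!same) {
                b.close(); // uncached instances must each be closed by the caller
            }
            return same;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path jar = createSampleJar();
        System.out.println("shared without caches: " + sharesJarFile(jar, false));
        System.out.println("shared with caches:    " + sharesJarFile(jar, true));
    }
}
```

With caching off, every connection gets its own JarFile, so any per-instance state (such as a completed signature verification) is not carried over to the next access.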

Reading the contents of a particular class file in the JAR requires opening an InputStream on the JAR. The method java.util.jar.JarFile.getInputStream ensures that, if the JAR is signed (contains signature files), the SHA-256 message digest of all files in the JAR is computed and the signature is verified. This expensive computation is performed only once in the lifetime of a java.util.jar.JarFile instance. However, because caching is disabled, we get a fresh instance for every single class file access, so the verification is repeated once per class file and the overall runtime becomes quadratic in the number of files.
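To make the quadratic growth concrete, here is a rough cost model. It assumes, as described above, that each fresh JarFile instance repeats work proportional to the number of entries n, and it counts abstract "entry checks" rather than measuring real time. The class and method names are illustrative only.

```java
// Hypothetical cost model: counts entry checks, not wall-clock time.
final class VerificationCostModel {

    // Every one of the n class-file accesses opens a fresh JarFile,
    // and each fresh instance re-checks all n entries.
    static long uncachedChecks(long n) {
        return n * n;
    }

    // A single shared JarFile checks the n entries once.
    static long cachedChecks(long n) {
        return n;
    }

    public static void main(String[] args) {
        long n = 5683; // number of class files reported in this issue
        System.out.println("uncached: " + uncachedChecks(n)); // prints "uncached: 32296489"
        System.out.println("cached:   " + cachedChecks(n));   // prints "cached:   5683"
    }
}
```

Under this model the uncached path does n times as much verification work as the cached one, which is consistent in direction with the slowdown reported above, although the model says nothing about absolute timings.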

What activates it?

The Quasar Java agent attempts to instrument all classes loaded by the classloader to make them suspendable. To do this, it opens up an InputStream to each class file that the class loader wants to load and inspects the bytecode of the class file. Because of the behaviour described above, this has a significant impact on performance if the JAR that contains the class file happens to be signed.
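The access pattern described above can be sketched as a java.lang.instrument.ClassFileTransformer that opens the class file as a resource for every class it is asked about. This is not Quasar's actual code: the class ResourceReadingTransformer and its streamsOpened counter are invented for illustration, and the sketch only reads the bytes instead of instrumenting them.

```java
import java.io.IOException;
import java.io.InputStream;
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical transformer that mimics the "open a stream per class" pattern.
final class ResourceReadingTransformer implements ClassFileTransformer {

    // Counts how many resource streams were opened (illustration only).
    final AtomicInteger streamsOpened = new AtomicInteger();

    @Override
    public byte[] transform(ClassLoader loader,
                            String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        // A null loader means the bootstrap loader; fall back to the system loader.
        ClassLoader source = (loader != null) ? loader : ClassLoader.getSystemClassLoader();
        // className uses '/' separators, e.g. "com/example/MyFlow".
        try (InputStream in = source.getResourceAsStream(className + ".class")) {
            if (in != null) {
                streamsOpened.incrementAndGet();
                byte[] buffer = new byte[8192];
                while (in.read(buffer) != -1) {
                    // A real agent would inspect the bytecode here.
                }
            }
        } catch (IOException e) {
            // Sketch only: a real agent would log or rethrow.
        }
        return null; // null tells the JVM to keep the class bytes unchanged
    }
}
```

When the resource lives in a signed JAR and URL caching is off, each getResourceAsStream call in a transformer like this goes through a new JarFile, which is where the repeated verification cost would be paid.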

Why is it not slow the second time?

Quasar itself contains a simple caching mechanism in its co.paralleluniverse.fibers.instrument.MethodDatabase class, which remembers the results of instrumenting each loaded class. Hence, when Quasar is invoked a second time and intercepts class loading for classes in a signed JAR it has already seen, it does not open the class resource input stream at all and thus avoids the quadratic behaviour. This is why subsequent invocations of a CSL flow (e.g., instantiating a contract for a second time) are much faster.
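The kind of per-class result cache described above can be sketched with a map keyed by class name. This is a generic memoization example, not the real MethodDatabase: the class InstrumentationCache, the analyzer function, and the analyses counter are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical memoizing cache; not Quasar's MethodDatabase.
final class InstrumentationCache {

    private final Map<String, Boolean> results = new ConcurrentHashMap<>();
    private final Function<String, Boolean> analyzer;

    // Counts how often the expensive analysis actually ran (illustration only).
    final AtomicInteger analyses = new AtomicInteger();

    InstrumentationCache(Function<String, Boolean> analyzer) {
        this.analyzer = analyzer;
    }

    // Runs the expensive analysis at most once per class name.
    boolean isSuspendable(String className) {
        return results.computeIfAbsent(className, name -> {
            analyses.incrementAndGet();
            return analyzer.apply(name);
        });
    }
}
```

For example, with a stand-in analyzer such as `name -> name.endsWith("Flow")`, asking about the same class name twice triggers only one analysis; the second lookup is served from the map without opening any stream.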

bjornbugge avatar Sep 10 '21 12:09 bjornbugge

Automatically created Jira issue: CORDA-4163

r3jirabot avatar Sep 10 '21 12:09 r3jirabot

Logged in the backlog for an internal review

nargas-ritu avatar Jan 11 '22 13:01 nargas-ritu

Quasar itself contains a simple caching mechanism in its co.paralleluniverse.fibers.instrument.MethodDatabase class, which will remember the results of instrumenting each loaded class.

Corda only instruments classes when they are first loaded by their ClassLoader, and we do not unload flow classes. Hence Quasar does not need to be invoked at all when running a flow for the second time, which is a more likely explanation for the second run being faster than the first.

chrisr3 avatar May 23 '22 15:05 chrisr3