cpg icon indicating copy to clipboard operation
cpg copied to clipboard

ThreadSafe TypeManager

Open anon767 opened this issue 3 years ago • 7 comments

Currently the TypeManager is implemented as Singleton Object. This causes some problems when trying to use the API in a multi-threaded manner.

Caused by: java.lang.NullPointerException: Cannot invoke "de.fraunhofer.aisec.cpg.frontends.LanguageFrontend.getScopeManager()" because "this.frontend" is null
	at de.fraunhofer.aisec.cpg.graph.TypeManager.getCommonType(TypeManager.java:446)
	at de.fraunhofer.aisec.cpg.graph.TypeManager.isSupertypeOf(TypeManager.java:570)
	at de.fraunhofer.aisec.cpg.graph.statements.expressions.CastExpression.typeChanged(CastExpression.java:85)
	at de.fraunhofer.aisec.cpg.graph.statements.expressions.Expression.lambda$refreshType$12(Expression.java:244)
	at java.base/java.lang.Iterable.forEach(Iterable.java:75)
	at de.fraunhofer.aisec.cpg.graph.statements.expressions.Expression.refreshType(Expression.java:242)
	at de.fraunhofer.aisec.cpg.helpers.SubgraphWalker.refreshType(SubgraphWalker.java:219)
	at de.fraunhofer.aisec.cpg.helpers.SubgraphWalker.refreshType(SubgraphWalker.java:216)
	at de.fraunhofer.aisec.cpg.helpers.SubgraphWalker.refreshType(SubgraphWalker.java:216)
	at de.fraunhofer.aisec.cpg.helpers.SubgraphWalker.refreshType(SubgraphWalker.java:216)
	at de.fraunhofer.aisec.cpg.helpers.SubgraphWalker.refreshType(SubgraphWalker.java:216)
	at de.fraunhofer.aisec.cpg.helpers.SubgraphWalker.refreshType(SubgraphWalker.java:216)
	at de.fraunhofer.aisec.cpg.helpers.SubgraphWalker.refreshType(SubgraphWalker.java:216)
	at de.fraunhofer.aisec.cpg.helpers.SubgraphWalker.refreshType(SubgraphWalker.java:216)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
	at java.base/java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1087)
	at de.fraunhofer.aisec.cpg.passes.TypeHierarchyResolver.accept(TypeHierarchyResolver.java:95)
	at de.fraunhofer.aisec.cpg.passes.TypeHierarchyResolver.accept(TypeHierarchyResolver.java:55)
	at de.fraunhofer.aisec.cpg.TranslationManager.analyze$lambda-2(TranslationManager.kt:96)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1764)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1756)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.helpAsyncBlocker(ForkJoinPool.java:1145)
	at java.base/java.util.concurrent.ForkJoinPool.helpAsyncBlocker(ForkJoinPool.java:3160)
	at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1924)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2085)

I think the conventional fix would be to add a synchronized qualifier to getInstance(). A more efficient way (but maybe "polluting") is to keep a ThreadLocal storage with an instance per thread.

anon767 avatar Jan 31 '22 21:01 anon767

Currently the TypeManager is implemented as Singleton Object. This causes some problems when trying to use the API in a multi-threaded manner.

Caused by: java.lang.NullPointerException: Cannot invoke "de.fraunhofer.aisec.cpg.frontends.LanguageFrontend.getScopeManager()" because "this.frontend" is null
	at de.fraunhofer.aisec.cpg.graph.TypeManager.getCommonType(TypeManager.java:446)
	at de.fraunhofer.aisec.cpg.graph.TypeManager.isSupertypeOf(TypeManager.java:570)
	at de.fraunhofer.aisec.cpg.graph.statements.expressions.CastExpression.typeChanged(CastExpression.java:85)
	at de.fraunhofer.aisec.cpg.graph.statements.expressions.Expression.lambda$refreshType$12(Expression.java:244)
	at java.base/java.lang.Iterable.forEach(Iterable.java:75)
	at de.fraunhofer.aisec.cpg.graph.statements.expressions.Expression.refreshType(Expression.java:242)
	at de.fraunhofer.aisec.cpg.helpers.SubgraphWalker.refreshType(SubgraphWalker.java:219)
	at de.fraunhofer.aisec.cpg.helpers.SubgraphWalker.refreshType(SubgraphWalker.java:216)
	at de.fraunhofer.aisec.cpg.helpers.SubgraphWalker.refreshType(SubgraphWalker.java:216)
	at de.fraunhofer.aisec.cpg.helpers.SubgraphWalker.refreshType(SubgraphWalker.java:216)
	at de.fraunhofer.aisec.cpg.helpers.SubgraphWalker.refreshType(SubgraphWalker.java:216)
	at de.fraunhofer.aisec.cpg.helpers.SubgraphWalker.refreshType(SubgraphWalker.java:216)
	at de.fraunhofer.aisec.cpg.helpers.SubgraphWalker.refreshType(SubgraphWalker.java:216)
	at de.fraunhofer.aisec.cpg.helpers.SubgraphWalker.refreshType(SubgraphWalker.java:216)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
	at java.base/java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1087)
	at de.fraunhofer.aisec.cpg.passes.TypeHierarchyResolver.accept(TypeHierarchyResolver.java:95)
	at de.fraunhofer.aisec.cpg.passes.TypeHierarchyResolver.accept(TypeHierarchyResolver.java:55)
	at de.fraunhofer.aisec.cpg.TranslationManager.analyze$lambda-2(TranslationManager.kt:96)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1764)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1756)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.helpAsyncBlocker(ForkJoinPool.java:1145)
	at java.base/java.util.concurrent.ForkJoinPool.helpAsyncBlocker(ForkJoinPool.java:3160)
	at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1924)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2085)

I think the conventional fix would be to add a synchronized qualifier to getInstance(). A more efficient way (but maybe "polluting") is to keep a ThreadLocal storage with an instance per thread.

Not sure, if synchronized alone will help here. A better solution would be to get rid of the frontend property and have the frontend supplied into the necessary functions as a parameter or check why it is even necessary. To be honest, I do not quite get all the magic that is being done in getCommonType.

oxisto avatar Jan 31 '22 21:01 oxisto

Not sure, if synchronized alone will help here.

Me neither, thats why I suggested using ThreadLocal

A better solution would be to get rid of the frontend property and have the frontend supplied into the necessary functions as a parameter or check why it is even necessary.

This sounds like the best option here, since, imho, the less global state the better.

To be honest, I do not quite get all the magic that is being done in getCommonType.

Hmm, a lot is happening there :D

anon767 avatar Jan 31 '22 21:01 anon767

This fixed my Issues associated with the concurrency exceptions: https://github.com/anon767/cpg/blob/c9e2de2dc9966eb814c16cb5ed0c086e2f8b1a94/cpg-core/src/main/java/de/fraunhofer/aisec/cpg/graph/TypeManager.java#L73

anon767 avatar Jan 31 '22 23:01 anon767

This fixed my Issues associated with the concurrency exceptions: https://github.com/anon767/cpg/blob/c9e2de2dc9966eb814c16cb5ed0c086e2f8b1a94/cpg-core/src/main/java/de/fraunhofer/aisec/cpg/graph/TypeManager.java#L73

Hmm did you run the unit tests by any chance? wouldn't this now create different type managers per thread, with individual type hash maps / sets. Not sure if this will work properly.

oxisto avatar Feb 01 '22 09:02 oxisto

This fixed my Issues associated with the concurrency exceptions: https://github.com/anon767/cpg/blob/c9e2de2dc9966eb814c16cb5ed0c086e2f8b1a94/cpg-core/src/main/java/de/fraunhofer/aisec/cpg/graph/TypeManager.java#L73

Hmm did you run the unit tests by any chance? wouldn't this now create different type managers per thread, with individual type hash maps / sets. Not sure if this will work properly.

Yes you are right. Its more of a dirty hack for my purposes and overall 4 tests are failing (in TypeTests).

But please correct me if Im wrong, if Im doing something like this:

files.parallelStream().forEach { analyze(listOf(it) ,...) }

Things should work as expected right? Since every run of the passes for a single file should be isolated anyhow.

anon767 avatar Feb 01 '22 09:02 anon767

This fixed my Issues associated with the concurrency exceptions: https://github.com/anon767/cpg/blob/c9e2de2dc9966eb814c16cb5ed0c086e2f8b1a94/cpg-core/src/main/java/de/fraunhofer/aisec/cpg/graph/TypeManager.java#L73

Hmm did you run the unit tests by any chance? wouldn't this now create different type managers per thread, with individual type hash maps / sets. Not sure if this will work properly.

Yes you are right. Its more of a dirty hack for my purposes and overall 4 tests are failing (in TypeTests).

But please correct me if Im wrong, if Im doing something like this:

files.parallelStream().forEach { analyze(listOf(it) ,...) }

Things should work as expected right? Since every run of the passes for a single file should be isolated anyhow.

I would assume that your analysis is not 100% accurate, especially if types are used across different set of files that are spread across different threads, as you are missing a global picture of all types. Not sure, if this is of practical relevance though.

oxisto avatar Feb 01 '22 10:02 oxisto

Right, but my goal is to analyze as many single, isolated files as fast as possible :) As long as the API is not threadsafe I guess I have to stick to that hack :)

anon767 avatar Feb 01 '22 10:02 anon767