opal icon indicating copy to clipboard operation
opal copied to clipboard

Special handling for `Thread.start` in CG construction does not work when fields are involved

Open johannesduesing opened this issue 1 year ago • 1 comments

The Problem Currently (even with #229 in place), OPAL will not detect the call to run for the following example. I want to point out that i'm not trying to construct unnecessarily complex artificial examples here, this is an actual issue that prevents me from analyzing DGMF, specifically this line 😄

class FieldsDemo implements Runnable {

    private static Thread theThread;

    public static void main(String[] args){
        theThread = new Thread(new FieldsDemo());

        theThread.start();
    }

    @Override
    public void run(){
        System.out.println("Running");
    }
}

The reason for this is twofold:

  • TypeIterator.foreachAllocation does not consider field reads (neither GetField nor GetStatic) as valid allocations.
  • ThreadRelatedCallsAnalysis.handleThreadInit only processes a thread definition site if it is a direct definition, meaning a constructor invocation. It does not attempt to find possible definition sites for an object when it is retrieved from within a field.

What i thought the solution could be Honestly, i'm a little unsure as to how a proper solution for this would look like. In a perfect world, i would:

  • Look into all field write accesses to the field that is being retrieved
  • Find all definition sites for the variables that are assigned to the field via those write accesses
  • Recursively invoke ThreadRelatedCallsAnalysis.handleThreadInit for each of those actual definition sites

What i've done so far: I implemented a simple solution that solves the example above. It only looks for field write accesse inside the current method, and recursively invokes handleThreadInit on all definition sites. The issue for me is that in order to analyze possible definition sites in other methods, i need to obtain their TACAI results, and i (think i) would need to obtain their typeIteratorState as well - i'm not sure if and how that works. My current solution looks something like this (inside ThreadRelatedCallsAnalysis.handleThreadInit, plus minor fixes for TypeIterator.foreachAllocation:

        case Assignment(_, _, GetStatic(_, declClass, name, declType)) =>
            val declFields = project.get(DeclaredFieldsKey)
            val theDeclaredField = declFields(declClass, name, declType)
            val fieldAccesses = project.get(FieldAccessInformationKey)
            val tac = typeIteratorState.tac

            fieldAccesses
                .writeAccesses(theDeclaredField.definedField)(declFields)
                .filter(_._1 == typeIteratorState.callContext.id)
                .foreach{ case (_, pc, _, _) =>
                    val putStatic = stmts(tac.pcToIndex(pc)).asPutStatic
                    if(putStatic.value.isVar){
                        putStatic
                            .value
                            .asVar
                            .definedBy
                            .filter(_ >= 0)
                            .foreach{ defSite =>
                                handleThreadInit(callContext, callPC, allocationContext, defSite, stmts, partialAnalysisResults)
                            }
                    }


                }

Questions

  • Do we event want to handle this fully, or is it out of scope for the analysis?
  • This adds the DeclaredFieldsKey and FieldAccessInformationkey to the required project information, would this be a performance issue? Is the trade-off worth it?

johannesduesing avatar Nov 11 '24 16:11 johannesduesing

Which call graphs did you try? I think the first problem should not exist for the AllocationSiteBasedPointsToCallGraphKey (i.e., 0-1-CFA). This is by design, though, at least in theory, we could extend the simple type iterators (CHA, etc.) to do something like your suggestion to capture (some) additional relevant cases. I don't think we should do this inside an analysis like the ThreadRelatedCallsAnalysis, because that would make behavior inconsistent between different such analyses or lead to code duplication. Whether this is a good idea or whether the necessary addition of the FieldKeys are a good idea might be debatable. A solution could be to make this accessible in a modular way, e.g. with type iterators that only support simple locale allocations as CHA, etc. do now, and ones that also support some more complex cases involving fields. Could lead to a proliferation of TypeIterators though.

The second problem is actually only an extension of the first one, since allocations by definition kind of have to be constructor invocations (not completely true, reflective construction of objects or native methods returning freshly constructed objects also count - another improvement to the ThreadRelatedCallsAnalysis might be in order to deal with such allocations).

errt avatar Nov 12 '24 12:11 errt