NullAway icon indicating copy to clipboard operation
NullAway copied to clipboard

JSpecifyMode: Incomplete checking on reads from `Iterable<@Nullable X>` and `@Nullable X[]`

Open jeffrey-easyesi opened this issue 4 months ago • 6 comments

Explicit use of .iterator()/.next() is checked correctly (if Iterator is null-marked by LibraryModels), but the equivalent for-each loop is not:

void use(Integer x) {}
void f1(Iterable<@Nullable Integer> list) {
    for (var it = list.iterator(); it.hasNext(); ) { use(it.next()); }    // warning, it.next() is nullable
    for (var it = list.iterator(); it.hasNext(); ) { int x = it.next(); } // warning, it.next() is nullable

    for (Integer x : list) use(x);  // no warning
    for (int x : list) {}           // no warning
}

Also, I tested the analogous loops with an array, and found that there's no unboxing checking on either:

void f2(@Nullable Integer[] array) {
    for (int i = 0; i < array.length; i++) { use(array[i]); }    // warning, array[i] is nullable
    for (int i = 0; i < array.length; i++) { int x = array[i]; } // no warning
    
    for (Integer x : array) use(x); // warning, x is nullable
    for (int x : array) {}          // no warning
}

jeffrey-easyesi avatar Aug 21 '25 21:08 jeffrey-easyesi

I looked into these a bit. There are actually 3 different issues:

1. for of reference over Iterable<@Nullable Foo>:

The synthetic iterator variable in dataflow isn't built with the right element type: it's literally an "Iterator<T>". MethodInvocationNode.getIterableExpression() might be useful - on synthetic next() calls it returns the original Iterable expression tree.

2. for of primitive over either iterable or array:

NullAway.matchEnhancedForLoop just doesn't do any checking of the element type.

3. int x = array[i]

This is a weird one. The first call to AccessPathNullnessAnalysis.visitArrayAccess on the array[i] node correctly determines the result is nullable. Then, visitMethodInvocation on the synthetic array[i].intValue() node stores a non-null access path for array[i] in its result, which makes sense. But then array[i] gets visited again and sees the access path as non-null, even though that shouldn't be visible from the input side.

Looking at the CFG structure, the same ArrayAccessNode is appearing in blocks both before and after the method invocation. I think this is a bug in Checker Framework's dataflow library. Checker Framework itself seems to be affected too - it doesn't warn on int x = array[i] either.

jeffrey-easyesi avatar Aug 27 '25 15:08 jeffrey-easyesi

Thanks so much @jeffrey-easyesi for looking into these and for the report! We'll try to dig deeper as soon as we can.

msridhar avatar Aug 27 '25 15:08 msridhar

Also, if you happen to have cycles to open a PR on any of these that would be welcome. But understood if you can't, we still appreciate the report.

msridhar avatar Aug 27 '25 15:08 msridhar

@jeffrey-easyesi did you make both Iterable and Iterator null-marked via a library model to reproduce those cases?

msridhar avatar Sep 02 '25 15:09 msridhar

We do have a LibraryModels that null-marks many common collection-related classes, including both java.lang.Iterable and java.util.Iterator. The main purpose of this LibraryModels is to prevent unintentionally adding nullables to a collection of non-null elements, and it does work well for that.

But I found that, either with or without this LibraryModels, some forms of iteration over nullable elements still aren't quite safe. (I only consider these to be minor issues, since collections with nullable element types are rare.)

jeffrey-easyesi avatar Sep 02 '25 17:09 jeffrey-easyesi

I'm just trying to figure out the best way, for the moment, to write a regression test for some of these cases involving Iterables. At some point (hopefully soon-ish) we should have better library models for the JDK so we don't need to do that. Thanks!

msridhar avatar Sep 02 '25 17:09 msridhar