psalm icon indicating copy to clipboard operation
psalm copied to clipboard

Psalm fails when processing more than 512 files when using intersection types

Open joaojacome opened this issue 2 years ago • 2 comments

Hello!

I've been having some failures with Psalm in a project, and looks like Psalm is failing when there's more than 512 files being analysed, whenever it runs with more than 1 thread.

At first I thought it could've been something in my local env, but after some tries, I managed to reproduce it in a totally separated environment.

Affected versions

  • 5.0.0-beta1

Couldn't reproduce it on master due to another issue.

How to reproduce

I've created this repo here, and I've added a github action to reproduce it:

https://github.com/joaojacome/psalm-test

joaojacome avatar Aug 09 '22 20:08 joaojacome

Hey @joaojacome, can you reproduce the issue on https://psalm.dev ?

psalm-github-bot[bot] avatar Aug 09 '22 20:08 psalm-github-bot[bot]

Thanks for the reproducer, that backtrace should be helpful.

AndrolGenhald avatar Aug 10 '22 00:08 AndrolGenhald

@joaojacome would you mind triggering the GHA build again? The logs have since expired.

weirdan avatar Dec 03 '22 04:12 weirdan

@weirdan I've triggered it now.

joaojacome avatar Dec 03 '22 13:12 joaojacome

Seems like 4.26 fails directly do to missing support for intersections. the beta seems to be consistent with dev-master. dev-master fails for : Intersection type could not be resolved in /home/runner/work/psalm-test/psalm-test/vendor/vimeo/psalm/src/Psalm/Internal/PhpVisitor/Reflector/TypeHintResolver.php:117

https://github.com/joaojacome/psalm-test/actions/runs/3608733479/jobs/6081505418#step:9:32

It seems do to so independently from the number of files but I'm not sure I see what intersection it complains about

orklah avatar Dec 03 '22 15:12 orklah

Looking at those runs, beta (and only that version) does emit tons of UndefinedClass when there are more than 512 files.

weirdan avatar Dec 04 '22 02:12 weirdan

And it's likely because it switched to multithreaded scan here:

https://github.com/vimeo/psalm/blob/a157743140fdfd7004491f55a5e30232abd4847c/src/Psalm/Internal/Codebase/Scanner.php#L334-L352

weirdan avatar Dec 04 '22 02:12 weirdan

I've had a look at this problem (we're running into this problem consistently at my company). I've not been able to create a fix, but here is my analysis of the problem:

I've worked with a minimal analysis available at https://github.com/annervisser/psalm-512-files-error-repro which I'll use as example. The code looks like this:

<?php // Bar.php

namespace App;

class Bar
{
    public Foo&Bar $prop3;
}
<?php // Foo.php

namespace App;

class Foo
{
    public Foo&Bar $prop3;
}

Analysis

  • It is indeed caused by the multithreading in src/Psalm/Internal/Codebase/Scanner.php
  • One thread encounters Foo.php, and adds Foo to the known classes (ClassLikes::$existing_classes_lc) ^1
  • It then encounters the property with type Foo&Bar and tries to resolve the intersection type ^2
  • intersectUnionTypes goes on to check if Bar exists, which it thinks it doesn't (because it hasn't yet loaded or scanned Bar.php [^3]
  • Another thread goes through the same process for Bar.php
  • The information from these two threads is then merged in addThreadData [^4]
  • In addThreadData the current data is prioritized, so whichever thread's data is set first determines which of the two classes (Foo or Bar) is set to false in the main thread. [^5]
  • The value of one of the classes being false in ClassLikes::$existing_classes_lc causes the errors in both the files. [^6]

Possible solution

  1. I'm assuming the fact that intersectUnionTypes checking if a class exists before it is loaded is a bug, and that that needs to be fixed. I'm not sure how to go about this: Make sure the class is loaded or change the intersection logic so it doesn't need to check
  2. If I'm wrong about that, another solution is to be more clever about merging the data in addThreadData. Instead of merging the arrays, we'd have to prioritize true values over false, seems hacky... I've verified this works for my case, but I'm not sure if it holds up with larger datasets.

[^3]: TypeHintResolver.php(113): Type::intersectUnionTypes() -> Type::intersectAtomicTypes -> AtomicTypeComparator::isContainedBy -> Codebase->classOrInterfaceOrEnumExists [^4]: https://github.com/vimeo/psalm/blob/2e90cb6c6afb3947fadade9ff416a19c6d6d48f0/src/Psalm/Internal/Codebase/ClassLikes.php#L2341 [^5]: https://github.com/vimeo/psalm/blob/2e90cb6c6afb3947fadade9ff416a19c6d6d48f0/src/Psalm/Internal/Codebase/ClassLikes.php#L2362 (note $this->existing_classes_lc being the second argument) [^6]: https://github.com/vimeo/psalm/blob/1c19260cddea29803cba2ef31c3a49189bcdeb0c/src/Psalm/Internal/Analyzer/ClassLikeAnalyzer.php#L260-L265

annervisser avatar Mar 01 '23 21:03 annervisser

I'm assuming the fact that intersectUnionTypes checking if a class exists before it is loaded is a bug

It certainly is. Even in a single-threaded scan, we may encounter Foo&Bar before we ever see Bar. For class constants where I believe we first encountered this, we postpone the actual type resolution until after the whole codebase is scanned and all symbols are known. Perhaps we should use a similar solution here.

weirdan avatar Mar 01 '23 22:03 weirdan

We've got the same problem in our project. This basically prevents one from using intersection types. But we found a temporary workaround. For some reason, using only one of the classes as the parameter and adding a doc block annotation with @param Foo&Bar does not lead to the same issue. This way we only have intersection types on "compile time" but not on runtime.

christian-kolb avatar Jul 19 '23 08:07 christian-kolb