psalm icon indicating copy to clipboard operation
psalm copied to clipboard

Taint analysis does not detect tainted sources in Iterator implementations

Open cedric-anne opened this issue 4 months ago • 3 comments

I work on an application that has its own logic to fetch the data from the database. Basically, when the database is requested, the result is then handled through an Iterator implementation. I used the @psalm-taint-source input tag to indicate that its current() method returns unsecure data, but it is not detected by Psalm in foreach loops.

Code:

<?php

class DataIterator implements Iterator
{
    private int $cursor = 0;

    public function __construct(private array $data)
    {}

    public function next(): void
    {
        $this->cursor++;
    }

    public function valid(): bool
    {
        return $this->cursor < count($this->data);
    }

    /**
     * @psalm-taint-source input
     */
    public function current(): mixed
    {
        return $this->data[$this->cursor];
    }

    public function rewind(): void
    {
        $this->cursor--;
    }

    public function key(): mixed
    {
        return $this->cursor;
    }
}

$data = $_GET; // $data is unsecure

// Undetected taint in foreach
$rows = new DataIterator($data);
foreach ($rows as $row) {
    echo json_encode($row);
}

// Detected taint when manually using the `current()` method
$rows = new DataIterator($data);
while ($rows->valid()) {
    $row = $rows->current();

    echo json_encode($row);

    $rows->next();
}

Result:

ERROR: TaintedHtml - tmp/test.php:52:10 - Detected tainted HTML (see https://psalm.dev/245)
  DataIterator::current - tmp/test.php:23:21
    public function current(): mixed

  $row - tmp/test.php:50:5
    $row = $rows->current();

  call to json_encode - tmp/test.php:52:22
    echo json_encode($row);

  json_encode#1 - tmp/test.php:52:22
    echo json_encode($row);

  json_encode - vendor/vimeo/psalm/stubs/CoreGenericFunctions.phpstub:1543:10
function json_encode(mixed $value, int $flags = 0, int $depth = 512) {}

  call to echo - tmp/test.php:52:10
    echo json_encode($row);



ERROR: TaintedTextWithQuotes - tmp/test.php:52:10 - Detected tainted text with possible quotes (see https://psalm.dev/274)
  DataIterator::current - tmp/test.php:23:21
    public function current(): mixed

  $row - tmp/test.php:50:5
    $row = $rows->current();

  call to json_encode - tmp/test.php:52:22
    echo json_encode($row);

  json_encode#1 - tmp/test.php:52:22
    echo json_encode($row);

  json_encode - vendor/vimeo/psalm/stubs/CoreGenericFunctions.phpstub:1543:10
function json_encode(mixed $value, int $flags = 0, int $depth = 512) {}

  call to echo - tmp/test.php:52:10
    echo json_encode($row);



------------------------------
2 errors found
------------------------------

There are thousands of foreach usages in the application, so replacing them by a while loop is not an option.

Is detecting the Traversable type in foreach loops (and in current(), next(), prev(), end() functions usages) is something possible in Psalm ?

cedric-anne avatar Aug 07 '25 07:08 cedric-anne

Hey @cedric-anne, can you reproduce the issue on https://psalm.dev? These will be used as phpunit tests when implementing the feature or fixing this bug.

psalm-github-bot[bot] avatar Aug 07 '25 07:08 psalm-github-bot[bot]

Yep absolutely possible, will implement when I get a break, can also accept PRs to implement this, the file to edit is https://github.com/vimeo/psalm/blob/6.x/src/Psalm/Internal/Analyzer/Statements/Block/ForeachAnalyzer.php, and it can be implemented by simply analyzing a set of virtual current(), next(), prev(), end() calls on the iterator in handleIterable (like those already used for getIterator or offsetGet elsewhere)

danog avatar Aug 07 '25 08:08 danog

I've tried to apply your suggestion but without success (to be fair, I have zero knowledge about psalm's internals so it isn't completely unexpected).

Here's the diff to add the required test case if someone else want to try it:

diff --git a/tests/TaintTest.php b/tests/TaintTest.php
index f29294f17..c731e874f 100644
--- a/tests/TaintTest.php
+++ b/tests/TaintTest.php
@@ -2650,6 +2650,54 @@ final class TaintTest extends TestCase
                     B::test($foo);',
                 'error_message' => 'TaintedCallable',
             ],
+            'taintFromIterator' => [
+                'code' => <<<'PHP'
+                    <?php
+
+                    class DataIterator implements Iterator
+                    {
+                        private int $cursor = 0;
+
+                        public function __construct(private array $data)
+                        {}
+
+                        public function next(): void
+                        {
+                            $this->cursor++;
+                        }
+
+                        public function valid(): bool
+                        {
+                            return $this->cursor < count($this->data);
+                        }
+
+                        /**
+                         * @psalm-taint-source input
+                         */
+                        public function current(): mixed
+                        {
+                            return $this->data[$this->cursor];
+                        }
+
+                        public function rewind(): void
+                        {
+                            $this->cursor--;
+                        }
+
+                        public function key(): int
+                        {
+                            return $this->cursor;
+                        }
+                    }
+
+                    $data = $_GET; // $data is unsecure
+                    $rows = new DataIterator($data);
+                    foreach ($rows as $row) {
+                        echo json_encode($row);
+                    }
+                PHP,
+                'error_message' => 'TaintedHtml',
+            ],
         ];
     }

AdrienClairembault avatar Aug 11 '25 12:08 AdrienClairembault