PHP_CodeSniffer icon indicating copy to clipboard operation
PHP_CodeSniffer copied to clipboard

Added ability to baseline violations

Open frankdekker opened this issue 3 years ago • 19 comments

Added the ability to baseline violations. #2543

Features

Create baseline phpcs src --report-baseline=phpcs.baseline.xml --basepath=/path/to/project/root

  • Writes phpcs.baseline.xml in the current working directory. Pref be the root of the project
  • basepath will be subtracted from the filepaths in the baseline xml.

Use default baseline Will by default search in the working directory for phpcs.baseline.xml phpcs src

Use custom location baseline phpcs src --baselineFile=<path/to/baseline.xml>

Changes

Added classes to read the baseline file and setup data structure. Added check in File::addMessage to check if the message is baselined. Added --baselineFile to cli options Added unit tests for the newly added files.

Documentation

My question, how do I update the documentation to add a section about the baseline feature?

frankdekker avatar Jul 11 '21 13:07 frankdekker

Just want to point out that this implementation doesn't baseline individual violations. Instead, it baselines specific sniffs for specific files. Which is different from what one might expect :)

sergey-solo avatar Jul 16 '21 13:07 sergey-solo

Just want to point out that this implementation doesn't baseline individual violations. Instead, it baselines specific sniffs for specific files. Which is different from what one might expect :)

Correct, will point that out in the documentation. It's a technical choice I took to avoid basing it on line number. Because that will become quickly annoying when you modify files :).

frankdekker avatar Jul 16 '21 14:07 frankdekker

I'm not convinced that this is a useful way of doing a baseline. When entire sniffs (or even sniff codes) are baselined for a file, adding new code with violations to that file wont show any errors if they happen to be the same sort of errors already in the file. For projects that need a baseline, or when introducing a new check to a standard, I think this is a pretty likely case.

I'd likely only accept a baseline feature if it was able to distinguish between violations in existing code and violations in new code. Anything else feels like an ongoing support issue for me as the feature wont work the way people expect and I'll have to defend the design decision for it. I'm absolutely happy to do this, but only if I believe the feature works the way it should.

This is one of those things where we really need a discussion about the solution design before a PR. There was some of that back on #2543, but it's been a while and no real solution was ever outlined. If you want to give this a go, it would be a terrific feature to have and I'd be happy to discuss any ideas you have for the solution over on that other issue.

gsherwood avatar Jul 18 '21 22:07 gsherwood

I'm not convinced that this is a useful way of doing a baseline. When entire sniffs (or even sniff codes) are baselined for a file, adding new code with violations to that file wont show any errors if they happen to be the same sort of errors already in the file. For projects that need a baseline, or when introducing a new check to a standard, I think this is a pretty likely case.

I'd likely only accept a baseline feature if it was able to distinguish between violations in existing code and violations in new code. Anything else feels like an ongoing support issue for me as the feature wont work the way people expect and I'll have to defend the design decision for it. I'm absolutely happy to do this, but only if I believe the feature works the way it should.

This is one of those things where we really need a discussion about the solution design before a PR. There was some of that back on #2543, but it's been a while and no real solution was ever outlined. If you want to give this a go, it would be a terrific feature to have and I'd be happy to discuss any ideas you have for the solution over on that other issue.

Completely understandable. This baseline mechanism will indeed suppress new issues in the same file. I'm happy to brainstorm how to make this feature more robust. The problem at hand is that we need a unique way to detect where the violations came from within the file, without explicitly counting tokens or line numbers.

I'll propose an approach in #2543

frankdekker avatar Jul 19 '21 07:07 frankdekker

@gsherwood Improved the baseline calculation based on the approach I suggested in #2543

Now a code signature will be calculated and added to the baseline report file. The code signature is the hash of the code on the line of the violation, and 1 line before and after.

During standard inspection when comparing against the baseline, the same signature will be calculated for encountered violations and compared against the baseline signature. The violation will only be skipped if the code signature is identical to the one in the baseline.

As mentioned, this will prevent a sniff being disabled for the entire file. The sniff will now be specifically disabled for the calculated code signature. Modifications in the file won't break the baseline, unless you edit one of the 3 lines around the baseline. At this point I feel it's quite useful to resolve the violation anyway, as you're editing that close to it.

I would love to hear your thoughts about this solution.

Regards

Frank Dekker

Example baseline:

<?xml version="1.0" encoding="UTF-8"?>
<phpcs-baseline version="3.6.1">
<violation file="src\Baseline\BaselineSet.php" sniff="PEAR.Commenting.FunctionComment.MissingParamName" signature="54accce13bb236dc361cf9d75895c274fce619d5"/>
<violation file="src\Baseline\BaselineSet.php" sniff="PEAR.Commenting.FunctionComment.MissingParamTag" signature="25a91da1dadc60a04a9a6615502ad4aa4fd397c2"/>
<violation file="src\Baseline\BaselineSet.php" sniff="PEAR.Functions.FunctionDeclaration.SpaceBeforeOpenParen" signature="edcfdf132eb46b842349404740ba7b8f224d82e1"/>
</phpcs-baseline>

frankdekker avatar Jul 31 '21 22:07 frankdekker

@gsherwood It's been a while. You had any chance yet to look at the latest proposal with the code signatures?

frankdekker avatar Oct 09 '21 11:10 frankdekker

Being able to have a baseline would be much appreciated.

@gsherwood Is there anything that can be done to move this forward?

Potherca avatar Mar 08 '22 15:03 Potherca

Being able to have a baseline would be much appreciated.

Is there anything that can be done to move this forward?

@Potherca At our company we really needed the baseline, so in the meantime we created the following package: https://github.com/123inkt/php-codesniffer-baseline

Might be an nice alternative for now.

frankdekker avatar Mar 08 '22 15:03 frankdekker

@gsherwood Hi Greg, would you be please so kind and take a look to Frank solution? It looks like it now mets your requirements about baseline functionality for phpcs. I am one of the many waiting for this feature, it will help us to rewrite large project. Thank you in advance! Martin ;-)

Housik avatar Apr 24 '22 18:04 Housik

@Housik This isn't something I have time to look at.

gsherwood avatar Apr 25 '22 22:04 gsherwood

Is there a chance to see this PR merged? It would be really useful for us too, especially when working with big codebases.

BafS avatar Oct 12 '22 03:10 BafS

Is there a chance to see this PR merged? It would be really useful for us too, especially when working with big codebases.

You can use https://github.com/123inkt/php-codesniffer-baseline in the meantime.

frankdekker avatar Oct 12 '22 06:10 frankdekker

@frankdekker unfortunately it didn't work well with our CI (because of editing a vendor file, which should be immutable).

I created a new report for that, without having the need to change a vendor, it's hacky but it works:


<?php declare(strict_types=1);

use PHP_CodeSniffer\Files\File;
use PHP_CodeSniffer\Reports\Full;
use PHP_CodeSniffer\Reports\Report;

final class BaselineReport implements Report
{
    public const BASELINE_FILE = __DIR__ . '/baseline.php';
    private Full $fullReport;

    public function __construct()
    {
        $this->fullReport = new Full();
    }

    private function cleanReportFromBaseline(array $baseline, array $report): array
    {
        $filename = $report['filename'];

        if (!isset($baseline[$filename])) {
            return $report;
        }

        $baselineSources = $baseline[$filename];
        $messages = &$report['messages'];

        foreach ($messages as $ka => $messageLine) {
            foreach ($messageLine as $kb => $messagesOff) {
                foreach ($messagesOff as $kc => $messageList) {
                    $source = $messageList['source'] ?? '';
                    if (isset($baselineSources[$source]) && $baselineSources[$source]['occurrences'] > 0) {
                        $type = $messages[$ka][$kb][$kc]['type'];

                        // Decrement the warnings/errors to keep in sync with the removal of messages in the report
                        match ($type) {
                            'WARNING' => $report['warnings']--,
                            'ERROR' => $report['errors']--,
                        };

                        unset($messages[$ka][$kb][$kc]);

                        $baselineSources[$source]['occurrences']--;
                    }
                }
            }
        }

        return $report;
    }

    public function generateFileReport($report, File $phpcsFile, $showSources = false, $width = 80): bool
    {
        $baseline = [];
        if (is_file(self::BASELINE_FILE)) {
            $baseline = require self::BASELINE_FILE;
            if (!is_array($baseline)) {
                $baseline = [];
            }
        }

        $report = $this->cleanReportFromBaseline($baseline, $report);

        return $this->fullReport->generateFileReport($report, $phpcsFile, $showSources, $width);
    }

    public function generate($cachedData, $totalFiles, $totalErrors, $totalWarnings, $totalFixable, $showSources = false, $width = 80, $interactive = false, $toScreen = true): void {
        // `Full` report only uses $cachedData
        $this->fullReport->generate($cachedData, $totalFiles, $totalErrors, $totalWarnings, $totalFixable, $showSources, $width, $interactive, $toScreen);
    }
}

Usable like: ./vendor/bin/phpcs '--report-App\Dev\PHP_CodeSniffer\GenerateBaselineReport'.

And to generate the baseline

<?php declare(strict_types=1);

use PHP_CodeSniffer\Files\File;
use PHP_CodeSniffer\Reports\Report;
use App\Serializer\PhpEncoder;

final class GenerateBaselineReport implements Report
{
    private static array $baseline = [];

    public function generateFileReport($report, File $phpcsFile, $showSources = false, $width = 80): bool
    {
        $filename = $report['filename'];

        /** @var array<string, int> $occurrences */
        $occurrences = [];
        $messages = &$report['messages'];
        foreach ($messages as $messageLine) {
            foreach ($messageLine as $messagesOff) {
                foreach ($messagesOff as $messageList) {
                    $occurrences[$messageList['source']] ??= 0;
                    $occurrences[$messageList['source']]++;
                }
            }
        }

        foreach ($occurrences as $source => $occurrence) {
            self::$baseline[$filename][$source]['occurrences'] = $occurrence;
        }

        return true;
    }

    public function generate(
        $cachedData,
        $totalFiles,
        $totalErrors,
        $totalWarnings,
        $totalFixable,
        $showSources = false,
        $width = 80,
        $interactive = false,
        $toScreen = true,
    ): void {
        global $argv;

        $phpEncoder = new PhpEncoder();
        $out = $phpEncoder->encode(self::$baseline, PhpEncoder::FORMAT, [
            'strict' => true,
        ]);

        $inputString = implode(' ', $argv);
        $out = str_replace('return ', "// Generated with: $inputString\nreturn ", $out);

        echo $out;
    }
}

Usable like: ./vendor/bin/phpcs --report-App\\Dev\\PHP_CodeSniffer\\GenerateBaselineReport --sniffs=SlevomatCodingStandard.TypeHints.ReturnTypeHint,SlevomatCodingStandard.TypeHints.DeclareStrictTypes > app/Dev/baseline.php

BafS avatar Nov 01 '22 09:11 BafS

@frankdekker it would be great to also have notices for false positives. See my explanation here; https://github.com/123inkt/php-codesniffer-baseline/issues/7

PHPStan also does this, it's called "unmatched ignored errors", see https://phpstan.org/user-guide/baseline#generating-the-baseline

@gsherwood what can we do for you to move this PR forward? Maybe @jrfnl can take a look since for some reason primarily Dutch people are active in this PR? :smiley:

peterjaap avatar Nov 20 '22 08:11 peterjaap

@BafS I wanted to give your solution a whirl, but I'm missing the PhpEncoder dep. Judging from the method signature, it's not https://github.com/Riimu/Kit-PHPEncoder?

peterjaap avatar Nov 20 '22 09:11 peterjaap

@peterjaap right, it's from Symfony\Component\Serializer\Encoder\EncoderInterface but with my own implementation. I edited my answer to use the native var_export function.

BafS avatar Nov 20 '22 16:11 BafS

Hi to all, thanks for this PR and the discussion in it. Like you, we also need a baseline and better ability to ignore valid errors. I look over all your solutions and prepared a package, that works in a similar way as PHPStan (also with outdated info). There must still be some magic and hacking, but it works without hard changing files in the vendor. We're using this for a couple of weeks in our production CI and everything looks good. You can find it here https://github.com/forrest79/PHPCS-Ignores. I will be glad for your comment and improvement. Still, I'm hoping, that PHPCS adds something like this as a new feature :-)

forrest79 avatar Jan 27 '23 23:01 forrest79

Hi to all, thanks for this PR and the discussion in it. Like you, we also need a baseline and better ability to ignore valid errors. I look over all your solutions and prepared a package, that works in a similar way as PHPStan (also with outdated info). There must still be some magic and hacking, but it works without hard changing files in the vendor. We're using this for a couple of weeks in our production CI and everything looks good. You can find it here https://github.com/forrest79/PHPCS-Ignores. I will be glad for your comment and improvement. Still, I'm hoping, that PHPCS adds something like this as a new feature :-)

Hihi, also nice trick to hook into PHPCS reporting. And same here, still hoping PHPCS comes with a build-in implementation :).

frankdekker avatar Jan 28 '23 10:01 frankdekker

This is an interesting approach: https://github.com/isaaceindhoven/php-code-sniffer-baseliner

Alternatives:

  • https://github.com/123inkt/php-codesniffer-baseline
  • https://github.com/forrest79/PHPCS-Ignores

Morgy93 avatar Sep 06 '23 14:09 Morgy93