PHP_CodeSniffer PSR1 doesn't check for file encoding

The PSR1 standard states that "Files MUST use only UTF-8 without BOM for PHP code". There is no check for files that use an encoding other than UTF-8; the existing sniff only checks for a BOM. If a file is encoded with, for example, windows-1252 and does not have a BOM, the check passes.

Steps to reproduce the behavior:

1. Create a file called test.php containing any code, saved in an encoding other than UTF-8 and without a BOM
2. Run phpcs --standard=PSR1 test.php
3. No errors are shown regarding the file encoding
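
For the first step, a file like this can be produced with a short script. This is a minimal sketch in Python, used here only to write the bytes. The file name test.php comes from the steps above; the helper name write_cp1252_php and the comment text inside the file are assumptions made for illustration.

```python
from pathlib import Path


def write_cp1252_php(path: str = "test.php") -> bytes:
    """Write a small PHP file encoded as windows-1252, without a BOM."""
    # The comment text is arbitrary; the "é" is what makes the encoding matter.
    source = "<?php\n// café\necho 'ok';\n"
    data = source.encode("cp1252")  # "é" becomes the single byte 0xE9
    Path(path).write_bytes(data)
    return data


if __name__ == "__main__":
    write_cp1252_php()
```

The resulting file has no BOM and contains a byte (0xE9) that is not valid UTF-8 in that position, so it is a suitable input for the phpcs command in the second step.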

Expected behavior

There should be an error regarding the file encoding.


Operating System	Debian 11.7 Bullseye
PHP version	8.2.6
PHP_CodeSniffer version	3.7.2
Standard	PSR1, PSR2, PSR12
Install type	Composer local

[X] I have searched the issue list and am not opening a duplicate issue.
[X] I confirm that this bug is a bug in PHP_CodeSniffer and not in one of the external standards.
[X] I have verified the issue still exists in the master branch of PHP_CodeSniffer.

Jun 12 '23 17:06 lucraraujo

The Generic.Files.ByteOrderMark sniff is only intended to check for the byte order mark. It does not check the file encoding, so that sniff is working correctly.

What I believe you are trying to report is that there is no sniff checking if files are encoded as UTF-8.

While I do believe it is possible to check what encoding a file claims to use, I do not believe it is possible to reliably verify that the claim is actually correct. I may well be wrong, though, and/or reality may have superseded the research I did in the distant past when I looked into something like this before.

I'll mark this as a feature request for now and would be interested to hear if someone has found a way to do this.
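
One partial approach, sketched here as an assumption and not as an existing sniff: strict UTF-8 decoding can prove that a file is not valid UTF-8, even though a successful decode cannot prove the file was authored as UTF-8. The sketch is in Python for brevity, and the names is_valid_utf8 and check_file_bytes are hypothetical.

```python
UTF8_BOM = b"\xef\xbb\xbf"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as strict UTF-8."""
    try:
        data.decode("utf-8")  # strict error handling is the default
        return True
    except UnicodeDecodeError:
        return False


def check_file_bytes(data: bytes) -> list:
    """Collect encoding-related problems for one file's contents."""
    problems = []
    if data.startswith(UTF8_BOM):
        problems.append("file starts with a UTF-8 BOM")
    if not is_valid_utf8(data):
        problems.append("file contains byte sequences that are not valid UTF-8")
    return problems


if __name__ == "__main__":
    # windows-1252 bytes with a non-ASCII character: reported as invalid.
    print(check_file_bytes("<?php // café".encode("cp1252")))
    # ASCII-only bytes: nothing to report, whatever encoding was intended.
    print(check_file_bytes(b"<?php echo 1;"))
```

The limits match the concern above. ASCII-only files pass whatever encoding the editor was set to, and some windows-1252 byte sequences (for example 0xC3 0xA9) also happen to be valid UTF-8, so a clean result means "not provably wrong" and nothing more.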

Jun 12 '23 17:06 jrfnl

You're right. It's more a feature request than a bug.

Jun 12 '23 20:06 lucraraujo
