doc-en icon indicating copy to clipboard operation
doc-en copied to clipboard

Unexpected return values: strcmp(), strncmp(), strcasecmp(), strncasecmp

Open sheinrich177 opened this issue 1 year ago • 5 comments

Description

Disclaimer: This is my first bug report, I researched but didn't find anything related. Please request if any further information is required or close this report if it isn't valid.

Description:

During PHP 8.1 to 8.3 update check for a project I found change information regarding return values of strcmp(), strncmp(), stcasecmp() and strncasecmp() in PHP 8.2:

  • https://www.php.net/manual/en/migration82.other-changes.php#migration82.other-changes.functions.core

The new return values are also mentioned on each of the documentation pages.

Expected behavior:

  • return values: 0, -1 and 1

Actual behavior:

  • different return values (see below)

Environment:

  • Linux Mint 21.3 Virginia
  • different PHP versions by: ppa:ondrej/php

test.php:

<?php

    var_dump(
            strcmp('bar', 'foo'),
            strcmp('foo', 'bar'),
            strcasecmp('bar', 'foo'),
            strcasecmp('foo', 'bar'),
            strncmp('bar', 'foo', 2),
            strncmp('foo', 'bar', 2),
            strncasecmp('bar', 'foo', 2),
            strncasecmp('foo', 'bar', 2),
    );

Output:

  • 8.1:
$ /usr/bin/php8.1 -v && /usr/bin/php8.1 -f test.php 

PHP 8.1.27 (cli) (built: Dec 21 2023 20:19:54) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.1.27, Copyright (c) Zend Technologies
    with Zend OPcache v8.1.27, Copyright (c), by Zend Technologies
    
int(-4)
int(4)
int(-4)
int(4)
int(-4)
int(4)
int(-4)
int(4)
  • 8.2:
$ /usr/bin/php8.2 -v && /usr/bin/php8.2 -f test.php 

PHP 8.2.15 (cli) (built: Jan 20 2024 14:17:05) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.2.15, Copyright (c) Zend Technologies
    with Zend OPcache v8.2.15, Copyright (c), by Zend Technologies

int(-4)
int(4)
int(-4)
int(4)
int(-4)
int(4)
int(-4)
int(4)
  • 8.3:
$ /usr/bin/php8.3 -v && /usr/bin/php8.3 -f test.php 

PHP 8.3.3-1+ubuntu22.04.1+deb.sury.org+1 (cli) (built: Feb 15 2024 18:38:52) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.3.3, Copyright (c) Zend Technologies
    with Zend OPcache v8.3.3-1+ubuntu22.04.1+deb.sury.org+1, Copyright (c), by Zend Technologies

int(-4)
int(4)
int(-4)
int(4)
int(-4)
int(4)
int(-4)
int(4)

Additionally (better: confusingly), I get a different output here: https://3v4l.org/C1i18 (please check current 8.2 and 8.3 versions separately - all supported versions include version 8.1).

int(-265725)
int(265725)
int(-4)
int(4)
int(-265742)
int(265742)
int(-4)
int(4)

PHP Version

PHP 8.2 / PHP 8.3

Operating System

Linux Mint 21.3

sheinrich177 avatar Feb 27 '24 16:02 sheinrich177

I'm confused about the documentation note. I found 0bc9241334eaebd737eb0455e9739257e74d0713, which introduced the wrong note...

What changed is the return value when lengths differ. In that case there was a bugfix in php/php-src#8220 so that it's now guaranteed to be smaller than 0 (-1), 0 or greater than 0 (1) if the length difference of two strings is greater than 2 GB.

The behaviour of strcmp & co is unchanged for strings where lengths match.

General question: Do we fix wrong notes in UPGRADING after the release?

bwoebi avatar Feb 27 '24 17:02 bwoebi

What changed is the return value when lengths differ.

Except that's not quite true either. https://3v4l.org/GUuA5

strcmp uses zend_binary_strcmp, whose relevant bit is

	retval = memcmp(s1, s2, MIN(len1, len2));
	if (!retval) {
		return ZEND_THREEWAY_COMPARE(len1, len2);
	} else {
		return retval;
	}

It will memcmp using the smaller length of the two strings (comparing a "prefix", so to speak) and if they're different then it returns whatever that value is - which is only guaranteed to be negative/positive [1]. If the two prefixes are the same, only then will it go to ZEND_THREEWAY_COMPARE, and that's the part which explicitly returns -1/0/+1 based on which string is shorter. https://3v4l.org/mT8di

So the implementation of the "always returns -1/0/+1" feature missed out on normalizing memcmp's return value.


[1] On Windows, I always get -1/+1. And 3v4l's live preview (which is WASM and currently only on PHP 8.1) also returns -1/+1. And it sounds like valgrind may also substitute its own memcmp that returns only -1/+1.

damianwadley avatar Feb 27 '24 18:02 damianwadley

Yeah, I wasn't 100% precise in specifying the exact circumstances. What I really wanted to say is that there was no intent in normalizing that value, just that the change fixes the integer overflow for size differences >2GB.

bwoebi avatar Feb 27 '24 18:02 bwoebi

@bwoebi I think it makes sense to fix the migration page in the docs. https://www.php.net/manual/en/migration82.other-changes.php#migration82.other-changes.functions.core

But more importantly, the docs themselves are also wrong.

https://www.php.net/manual/en/function.strcmp.php

Returns -1 if string1 is less than string2; 1 if string1 is greater than string2, and 0 if they are equal.

I'm moving this to the docs repo.

iluuu1994 avatar Feb 28 '24 13:02 iluuu1994

I checked again by using the official docker image from https://hub.docker.com/_/php on my Linux machine. I used the changed test script from https://3v4l.org/GUuA5 and only added the output of phpversion().

I get the same result as on my machine:

PHP: 8.2.16
int(-4)
int(4)
int(-4)
int(4)
int(-4)
int(4)
int(-4)
int(4)

The page 3v4l.org may has some issues by itself, maybe...

However, I also tested it on a different machine with Window 11 installed and with the packages:

  • php-8.2.16-Win32-vs16-x64.zip
  • php-8.2.16-nts-Win32-vs16-x64.zip

For both, comparing to my machine/Linux, I get a different, but regarding the return values -1, 0, 1 also "wrong" result:

int(-1)
int(1)
int(-4)
int(4)
int(-1)
int(1)
int(-4)
int(4)

So, I would assume that it isn't only a "wrong documentation" topic..

sheinrich177 avatar Feb 28 '24 14:02 sheinrich177

While there looks to be some dispute over whether this is a docs of core bug above; surely it can be removed from the docs for now?

This is quite a serious inaccuracy as the docs led me to believe code should target === -1|0|1.

ChrisHSandN avatar Jun 12 '24 10:06 ChrisHSandN