
mbstring serialize/unserialize incorrect behaviour compared to PHP

Open lucyllewy opened this issue 8 years ago • 1 comments

Serializing a multibyte character and then unserializing it again in Phalanger causes the character to change when echoed, as shown in the test case below.

Phalanger behaves differently from PHP in this respect:

  • PHP outputs é for both $val1 and $val2.
  • Phalanger outputs é for $val1, but ? for $val2.
<?php
$val1 = 'é';
$val2 = unserialize(serialize($val1));

$EOL = "<br>\n";

// let's test equality of the supposedly equal characters
echo ($val1 === $val2 ? 'Equal' : 'Not Equal') . $EOL; // Returns 'Equal', because $val1 === $val2 === 'é'
echo htmlentities($val1) . $EOL; // Returns html entity for 'é'
echo htmlentities($val2) . $EOL; // Returns '?' (Note that $val1 === $val2 !)

lucyllewy, Oct 08 '16 02:10

serialize/unserialize translate Unicode strings according to the current PageEncoding, which defaults to the encoding of the default Windows culture. I would always recommend setting PageEncoding to UTF-8 to avoid these issues.

jakubmisek, Oct 15 '16 07:10
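
To check whether the serialize/unserialize round trip is where the character changes, it may help to compare the raw bytes before and after it. The following is only a diagnostic sketch: `dump_bytes` is a hypothetical helper that is not part of the original report, and the character is written as explicit UTF-8 bytes on the assumption that this removes any dependence on the source file's encoding.

```php
<?php
// Diagnostic sketch; dump_bytes() is a hypothetical helper for illustration.
// 'é' is given as explicit UTF-8 bytes so the script does not depend on
// how this source file itself is encoded.
$val1 = "\xC3\xA9";

$serialized = serialize($val1);
$val2 = unserialize($serialized);

// Print a label followed by the hex bytes and byte length of a string.
function dump_bytes($label, $value) {
    $line = $label . ': ' . bin2hex($value) . ' (' . strlen($value) . ' bytes)';
    echo $line . "\n";
    return $line;
}

dump_bytes('val1', $val1);
dump_bytes('serialized', $serialized);
dump_bytes('val2', $val2);
```

On stock PHP the `val1` and `val2` lines should be identical. If they differ under Phalanger, the conversion is happening inside the round trip, which would be consistent with the PageEncoding explanation above.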