forceutf8

Not all words are converted

Open turgic opened this issue 6 years ago • 2 comments

I have two files. The first has one row containing: magn�sienne
The second has one row containing: magn�sienne
The result for one file is: magnésienne
For the other: magn?sienne
Implementation: $return[] = $this->utf8Encoding($row);

    private function utf8Encoding($datas)
    {
        foreach ($datas as $key => $data) {
            $datas[$key] = Encoding::fixUTF8($data);
        }
        return $datas;
    }

Do you have any idea? Thanks in advance.

turgic, Mar 15 '19 12:03

Here's an explanation of that special character:

U+FFFD � REPLACEMENT CHARACTER used to replace an unknown, unrecognized or unrepresentable character

So if your data source has already replaced the original character with that dummy character, there may be no way to get it back.
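One way to check for that situation is to look for the UTF-8 encoding of U+FFFD (the bytes EF BF BD) in the raw string before passing it to the library. The following is only a minimal sketch: `containsReplacementChar` is a hypothetical helper, not part of forceutf8, and the sample rows are assumed values chosen for illustration.

```php
<?php
// Hypothetical helper (not part of forceutf8): reports whether a string
// already contains U+FFFD, whose UTF-8 encoding is the bytes EF BF BD.
function containsReplacementChar(string $text): bool
{
    return strpos($text, "\xEF\xBF\xBD") !== false;
}

// Assumed sample rows, for illustration only.
$alreadyReplaced = "magn\xEF\xBF\xBDsienne"; // U+FFFD is stored in the data itself
$intact          = "magn\xC3\xA9sienne";     // "é" encoded as valid UTF-8

var_dump(containsReplacementChar($alreadyReplaced)); // bool(true)
var_dump(containsReplacementChar($intact));          // bool(false)
```

If the check returns true for a row, the original byte was lost before the data reached PHP, so no later conversion step can restore the "é".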

millenniumtree, May 24 '19 20:05

Although those two lines may look the same in your text editor, the underlying bytes may differ. One of them could contain the literal U+FFFD replacement character (usually drawn as a question mark), while the other could contain some other byte sequence that your editor doesn't recognize and therefore displays with the same question-mark glyph.
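To tell the two cases apart, it can help to dump the raw bytes of each row instead of relying on how an editor renders them. The sketch below is an illustration under stated assumptions: `describeBytes` is a hypothetical helper, and the two sample strings assume that one file stores "é" as the single ISO-8859-1 byte E9 while the other already contains U+FFFD. The actual files in this issue may differ.

```php
<?php
// Hypothetical helper: returns the hex dump of a string and whether
// its bytes form valid UTF-8.
function describeBytes(string $text): array
{
    return [
        'hex' => bin2hex($text),
        // With the /u modifier, preg_match() returns false when the
        // subject is not valid UTF-8, and 1 when the empty pattern matches.
        'valid_utf8' => preg_match('//u', $text) === 1,
    ];
}

// Assumed contents of the two rows, for illustration only.
$latin1Row   = "magn\xE9sienne";         // "é" as one ISO-8859-1 byte
$replacedRow = "magn\xEF\xBF\xBDsienne"; // literal U+FFFD in UTF-8

print_r(describeBytes($latin1Row));   // hex contains "e9"; valid_utf8 is false
print_r(describeBytes($replacedRow)); // hex contains "efbfbd"; valid_utf8 is true
```

Under these assumptions, only the first row still carries the information needed to recover "é"; the second is already valid UTF-8 that happens to contain the replacement character.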

garrettw, May 24 '19 20:05