word_wrap in Email.php (library) does not handle multibyte characters well

Open williamli opened this issue 12 years ago • 5 comments

word_wrap in Email.php (library) does not handle multibyte characters well. Long lines of plain-text Chinese characters come out garbled because it cuts multibyte characters in the middle.

williamli avatar Apr 23 '12 19:04 williamli

Could you try something out?

Go here http://php.net/manual/en/function.wordwrap.php and check out the contributed function iconv_wordwrap by "mail at dasprids dot de".

Put the same Chinese characters through that string, then let us know how it goes.

philsturgeon avatar Apr 26 '12 20:04 philsturgeon

Sure. I will give it a try this weekend. Hope it is not too late.

williamli avatar Apr 26 '12 20:04 williamli

Some improvements have been made, but it's still not flawless ... I'm not sure it can ever be. If anybody has feedback - please comment.

narfbg avatar Nov 04 '14 09:11 narfbg

So let me preface this by saying that my C knowledge is (almost) nonexistent.

PHP's wordwrap function uses strncmp, which operates on chars, which are 1 byte each, as opposed to a wide-character comparison such as wcsncmp... so the C code behind the wordwrap function sees your multi-byte character as several separate characters (a Chinese character is typically 3 bytes in UTF-8).
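To make the byte-versus-character point concrete, here is a small sketch. It assumes the mbstring extension is loaded and that the file is saved as UTF-8; the sample string and variable names are only for illustration.

```php
<?php
// Assumption: mbstring is available and this file is saved as UTF-8.
$text = '中文字';

// strlen() counts bytes; each of these characters takes 3 bytes in UTF-8.
$bytes = strlen($text);
// mb_strlen() counts characters when told the encoding.
$chars = mb_strlen($text, 'UTF-8');

// A cut at byte offset 4 lands inside the second character,
// so the result is no longer valid UTF-8.
$brokenCut = substr($text, 0, 4);
// A cut expressed in characters keeps every character whole.
$cleanCut = mb_substr($text, 0, 1, 'UTF-8');

$brokenIsValid = mb_check_encoding($brokenCut, 'UTF-8');
$cleanIsValid  = mb_check_encoding($cleanCut, 'UTF-8');

var_dump($bytes, $chars, $brokenIsValid, $cleanIsValid);
```

Any wrapping routine that measures and cuts with strlen()/substr() can hit the same problem whenever the limit falls inside a character.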

This can be fixed in 1 of 2 ways:

  1. Rewrite the word_wrap function so that it does not use PHP's wordwrap() and instead uses our own implementation. This will be slower, but should work if we use mb_strlen() for calculations instead of strlen().

  2. Add an mb_wordwrap() function to PHP, and then use that instead.

I haven't had time to test which of these functions works properly, but the word_wrap() function in the Email class could possibly be replaced by something from here: http://stackoverflow.com/questions/3825226/multi-byte-safe-wordwrap-function-for-utf-8
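For reference, option 1 could look roughly like the sketch below. It is not the Email library's code and not one of the Stack Overflow answers: the function name mb_word_wrap_sketch and its parameters are made up for illustration, it assumes UTF-8 input and the mbstring extension, and it only breaks on spaces and newlines.

```php
<?php
// Hypothetical helper, not part of CodeIgniter or PHP.
// Assumptions: UTF-8 input, mbstring loaded, words separated by single spaces.
function mb_word_wrap_sketch(string $text, int $width = 76, string $break = "\n"): string
{
    $out = [];
    foreach (explode("\n", $text) as $line) {
        $current = '';
        foreach (explode(' ', $line) as $word) {
            // A word longer than the limit is hard-split,
            // counting characters rather than bytes.
            while (mb_strlen($word, 'UTF-8') > $width) {
                if ($current !== '') {
                    $out[] = $current;
                    $current = '';
                }
                $out[] = mb_substr($word, 0, $width, 'UTF-8');
                $word  = mb_substr($word, $width, null, 'UTF-8');
            }
            // Append the word if it still fits, otherwise start a new line.
            $candidate = ($current === '') ? $word : $current . ' ' . $word;
            if (mb_strlen($candidate, 'UTF-8') <= $width) {
                $current = $candidate;
            } else {
                $out[] = $current;
                $current = $word;
            }
        }
        $out[] = $current;
    }
    return implode($break, $out);
}

echo mb_word_wrap_sketch('中文字中文字中文', 3), "\n";
```

Note that this counts characters, not display columns. CJK characters usually render double-width, so mb_strwidth() may be a better measure if the goal is visual line length. It also ignores any special handling the library's own word_wrap does.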

drew-hoffman avatar Feb 25 '15 21:02 drew-hoffman

Adding a multibyte wordwrap function to PHP is probably a bit beyond the scope of this project.

However, there are several examples of multibyte wordwrap functions at http://stackoverflow.com/questions/3825226/multi-byte-safe-wordwrap-function-for-utf-8

Any of those could be adapted/extended/coopted.

drivingmenuts avatar Feb 25 '15 21:02 drivingmenuts