
mb_convert_case MB_CASE_TITLE does not actually do Title Case

Open leeksoup opened this issue 1 year ago • 3 comments

Description

The following code:

<?php
$string = "Mrs. Frisby and the Rats of NIMH";
echo mb_convert_case($string, MB_CASE_TITLE);

Resulted in this output:

Mrs. Frisby And The Rats Of Nimh

But I expected this output instead:

Mrs. Frisby and the Rats of NIMH

The function uses the constant MB_CASE_TITLE and the documentation states:

MB_CASE_TITLE (int) Performs a full title-case conversion based on the Cased and CaseIgnorable derived Unicode properties. In particular this improves handling of quotes and apostrophes. This may change the length of the string.

In the case of my simple example, the function is simply upper-casing each word in the title, and changing all others to lowercase. For proper title casing, the words "and", "the" and "of" should remain lower case. Ideally, the acronym NIMH should remain all caps, but that is a much smaller issue.

PHP Version

PHP 8.2.16

Operating System

Fedora 39

leeksoup, Feb 25 '24 03:02

> In the case of my simple example, the function is simply upper-casing each word in the title, and changing all others to lowercase.

Exactly.

> For proper title casing, the words "and", "the" and "of" should remain lower case.

According to some rules of capitalization, yes. But that's not what MB_CASE_TITLE does - it's too sophisticated a requirement for a library devoted to handling character encodings. What you're looking for would have to be something powered by a database of (English) words and casing rules.

> Ideally, the acronym NIMH should remain all caps, but that is a much smaller issue.

A reasonable expectation, but again, that's not what MB_CASE_TITLE does: it manages case for everything in the string, not just the first letters of words.
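To make the point above concrete, here is a small sketch contrasting MB_CASE_TITLE with the byte-oriented ucwords(). The expected MB_CASE_TITLE result is the one reported in this issue; ucwords() is shown only for comparison, since it upper-cases the first character of each whitespace-delimited word and leaves every other character alone. It is not multibyte-aware, so it is not a drop-in replacement.

```php
<?php
// Illustration only: compares how the two functions treat non-initial letters.
$string = "Mrs. Frisby and the Rats of NIMH";

// MB_CASE_TITLE recases the whole string: first letter of each word up, the rest down.
$title = mb_convert_case($string, MB_CASE_TITLE, 'UTF-8');
echo $title, PHP_EOL;   // Mrs. Frisby And The Rats Of Nimh

// ucwords() only touches the first character of each word, so "NIMH" survives.
$words = ucwords($string);
echo $words, PHP_EOL;   // Mrs. Frisby And The Rats Of NIMH
```

Neither call knows which words are articles, conjunctions, or prepositions, so "and", "the" and "of" are capitalized in both results.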

damianwadley, Feb 25 '24 04:02

https://3v4l.org/2h7Zq

At least the behavior of mb_convert_case has been consistent since earlier versions. In the first place, mb_convert_case cannot understand the context of a sentence.

The current behavior of mb_convert_case seems correct to me.

youkidearitai, Feb 25 '24 04:02

OK, I can accept that it's too complex and will write my own function.

But then is it possible to make the flag constant (or at least the documentation) say explicitly that it's capitalizing every word? Describing it as "title case" is confusing, since by definition that would imply a "title" context.

Thank you.

leeksoup, Feb 25 '24 04:02
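For readers who land here with the same need, below is a minimal sketch of the kind of userland function mentioned above. The name title_case_en and the small-word list are hypothetical, chosen for illustration; they are not part of PHP or mbstring, and the list is far from a complete English style guide. The sketch assumes UTF-8 input and treats any fully upper-case word of two or more characters as an acronym to be left untouched.

```php
<?php
/**
 * Hypothetical helper (not part of PHP or mbstring): English-style title casing.
 * The default small-word list is an illustrative assumption only.
 */
function title_case_en(
    string $text,
    array $smallWords = ['a', 'an', 'and', 'the', 'of', 'in', 'on', 'to', 'for', 'or', 'but']
): string {
    // Split on whitespace, keeping the separators so the original spacing is preserved.
    $parts = preg_split('/(\s+)/u', $text, -1, PREG_SPLIT_DELIM_CAPTURE);
    if ($parts === false) {
        return $text; // e.g. invalid UTF-8: return the input unchanged
    }

    $wordIndex = 0;
    foreach ($parts as $i => $part) {
        if ($part === '' || preg_match('/^\s+$/u', $part)) {
            continue; // separator or empty piece, not a word
        }
        $isFirst = ($wordIndex === 0);
        $wordIndex++;

        $lower = mb_strtolower($part, 'UTF-8');
        $upper = mb_strtoupper($part, 'UTF-8');

        // Leave all-caps words of 2+ characters (e.g. "NIMH") exactly as written.
        if (mb_strlen($part, 'UTF-8') > 1 && $part === $upper && $part !== $lower) {
            continue;
        }
        // Small words stay lowercase unless they open the title.
        if (!$isFirst && in_array($lower, $smallWords, true)) {
            $parts[$i] = $lower;
            continue;
        }
        // Everything else: first character upper, remainder lower.
        $parts[$i] = mb_strtoupper(mb_substr($lower, 0, 1, 'UTF-8'), 'UTF-8')
            . mb_substr($lower, 1, null, 'UTF-8');
    }

    return implode('', $parts);
}

echo title_case_en("Mrs. Frisby and the Rats of NIMH"), PHP_EOL;
// Mrs. Frisby and the Rats of NIMH
```

One consequence of the acronym rule is that a title supplied entirely in capitals comes back unchanged, so such input would need to be lower-cased first. Handling of hyphenated words, punctuation attached to small words, and languages other than English is deliberately left out of this sketch.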