Capitalized names have strange behavior
Found some bugs with all-uppercase names and names with multiple middle names. Some examples are below; it looks like parts of the name get parsed as initials.
Names for testing:
- SOFIA GARCIA DE LA MANCHA
- DA LAT
- JUANITA MARIA DE SUR
- Garcia Marques, Gabriel
The last example is in "Last, First" format, so "Gabriel" should be the first name and "Garcia Marques" should be the last name.
Output:
(
[firstname] => Sofia
[middlename] => Garcia
[initials] => D E L A
[lastname] => Mancha
)
(
[firstname] => D
[initials] => A
[lastname] => Lat
)
(
[firstname] => Juanita
[middlename] => Maria
[initials] => D E
[lastname] => Sur
)
(
[firstname] => Garcia Gabriel
[lastname] => Marques
)
<?php
require_once __DIR__ . '/vendor/autoload.php';
$parser = new TheIconic\NameParser\Parser();
$namesToTest = array( 'SOFIA GARCIA DE LA MANCHA', 'DA LAT', 'JUANITA MARIA DE SUR', 'Garcia Marques, Gabriel' );
foreach ( $namesToTest as $input ) {
    $name = $parser->parse( $input );
    echo $name->getSalutation();
    echo $name->getFirstname();
    echo $name->getLastname();
    echo $name->getMiddlename();
    echo $name->getNickname();
    echo $name->getInitials();
    echo $name->getSuffix();
    print_r( $name->getAll() );
    echo $name;
}
Btw, one potential "fix" for the all-caps names is this:
if ( $input === strtoupper( $input ) ) {
    $input = ucwords( strtolower( $input ) );
}
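A slightly more defensive variant of that check, as a minimal sketch. `normalizeAllCaps` is a hypothetical helper name, not part of the library, and it assumes the `mbstring` extension is available. It differs from the snippet above in two ways: it leaves strings with no letters at all untouched, and it uses multibyte-aware functions, since `strtolower()` may not lowercase accented letters such as `É`.

```php
<?php
// Hypothetical helper, not part of theiconic/name-parser. Requires ext-mbstring.
function normalizeAllCaps( string $input ): string {
    // All caps: uppercasing changes nothing, but lowercasing does
    // (the second check skips strings that contain no letters).
    $isAllCaps = $input === mb_strtoupper( $input, 'UTF-8' )
        && $input !== mb_strtolower( $input, 'UTF-8' );

    // Title-case all-caps input; return mixed-case input unchanged.
    return $isAllCaps
        ? mb_convert_case( $input, MB_CASE_TITLE, 'UTF-8' )
        : $input;
}
```

The idea would be to call `$parser->parse( normalizeAllCaps( $input ) )` instead of parsing the raw input. Mixed-case names such as 'Garcia Marques, Gabriel' pass through unchanged.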
@dxdc have you tried adjusting the combined initials setting?
Good call, @luads!
$parser->setMaxCombinedInitials(1);
achieves the same result as:
if ( $input === strtoupper( $input ) ) {
    $input = ucwords( strtolower( $input ) );
}
It might be worth temporarily overriding any user-defined value for setMaxCombinedInitials and using 1 whenever $input === strtoupper( $input )... just a thought.
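From the calling side, one way to approximate that override without touching the library is to keep two parser instances and pick one per input. This is only a sketch: `isAllCaps` and `pickParser` are hypothetical names, and the only library calls it relies on are `parse()` and `setMaxCombinedInitials()` from the examples above.

```php
<?php
// Hypothetical helpers; the names are illustrative and not part of theiconic/name-parser.
function isAllCaps( string $input ): bool {
    // Same check as above, plus a guard so letterless strings do not count as all caps.
    return $input === strtoupper( $input ) && $input !== strtolower( $input );
}

// Return $capsParser for all-caps input and $defaultParser otherwise.
function pickParser( string $input, object $defaultParser, object $capsParser ): object {
    return isAllCaps( $input ) ? $capsParser : $defaultParser;
}

// Usage with the library (assumes Composer autoloading is set up):
// $defaultParser = new TheIconic\NameParser\Parser(); // keeps the user's settings
// $capsParser    = new TheIconic\NameParser\Parser();
// $capsParser->setMaxCombinedInitials( 1 );
// $name = pickParser( $input, $defaultParser, $capsParser )->parse( $input );
```

Using a second instance leaves the user's own setting alone, so nothing has to be restored after parsing an all-caps name.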
So, that leaves just examples like:
- Garcia Marques, Gabriel
- Pérez Quiñones, Manuel A.
This kind of "two surname" structure is common in Mexican names, for example. Not sure of the best way to handle it, although maybe it should be treated the same way as the name would be without the comma, i.e.
- Gabriel Garcia Marques
- Manuel A. Pérez Quiñones
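A minimal sketch of that rewrite as a pre-processing step, in case it helps the discussion. `reorderCommaName` is a hypothetical helper, not part of the library. It only moves the text after the first comma to the front. It does not decide how the parser should then split the two surnames, and it would mishandle inputs where the comma introduces a suffix (e.g. "Smith, Jr.").

```php
<?php
// Hypothetical helper, not part of theiconic/name-parser.
function reorderCommaName( string $input ): string {
    // Split on the first comma only: "Last, First" => [ "Last", " First" ].
    $parts = explode( ',', $input, 2 );
    if ( count( $parts ) < 2 ) {
        return trim( $input ); // no comma, nothing to reorder
    }

    $last  = trim( $parts[0] );
    $given = trim( $parts[1] );
    if ( $last === '' || $given === '' ) {
        return trim( $input ); // dangling comma, leave as-is
    }

    return $given . ' ' . $last;
}
```

With this, 'Garcia Marques, Gabriel' becomes 'Gabriel Garcia Marques' and 'Pérez Quiñones, Manuel A.' becomes 'Manuel A. Pérez Quiñones' before being handed to the parser.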