name-parser

Capitalized names have strange behavior

Open dxdc opened this issue 5 years ago • 3 comments

Found some bugs involving all-uppercase names and multiple middle names. Sharing some examples here; it seems parts of the name get parsed as initials.

Names for testing:

  • SOFIA GARCIA DE LA MANCHA
  • DA LAT
  • JUANITA MARIA DE SUR
  • Garcia Marques, Gabriel

The last example is in "Last, First" format, so "Gabriel" should be the first name and "Garcia Marques" should be the last name.

Output:

(
    [firstname] => Sofia
    [middlename] => Garcia
    [initials] => D E L A
    [lastname] => Mancha
)
(
    [firstname] => D
    [initials] => A
    [lastname] => Lat
)
(
    [firstname] => Juanita
    [middlename] => Maria
    [initials] => D E
    [lastname] => Sur
)
(
    [firstname] => Garcia Gabriel
    [lastname] => Marques
)

Test script:

<?php

require_once __DIR__ . '/vendor/autoload.php';

$parser = new TheIconic\NameParser\Parser();
$namesToTest = array( 'SOFIA GARCIA DE LA MANCHA', 'DA LAT', 'JUANITA MARIA DE SUR', 'Garcia Marques, Gabriel' );

foreach ( $namesToTest as $input ) {

    $name = $parser->parse( $input );
    echo $name->getSalutation();
    echo $name->getFirstname();
    echo $name->getLastname();
    echo $name->getMiddlename();
    echo $name->getNickname();
    echo $name->getInitials();
    echo $name->getSuffix();

    print_r( $name->getAll() );

    echo $name;
}

dxdc, Jul 21 '20 02:07

Btw, one potential "fix" for the capitalized names is this:

    if ( $input === strtoupper( $input ) ) {
        $input = ucwords( strtolower( $input ) );
    }
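
A possible multibyte-safe variant of the check above, as a sketch only. strtoupper() and strtolower() work on bytes, so accented capitals (as in PÉREZ QUIÑONES) may not be lowercased correctly. This version assumes UTF-8 input and that the mbstring extension is available; normalizeAllCaps is a hypothetical helper name, not part of name-parser.

```php
<?php

/**
 * Hypothetical helper (not part of name-parser): title-case the input
 * only when uppercasing would leave it unchanged, i.e. it is all caps.
 * Assumes UTF-8 input and the mbstring extension.
 */
function normalizeAllCaps(string $input): string
{
    // If uppercasing changes nothing, treat the input as all-uppercase.
    if ($input === mb_strtoupper($input, 'UTF-8')) {
        return mb_convert_case($input, MB_CASE_TITLE, 'UTF-8');
    }

    // Mixed-case input is returned unchanged.
    return $input;
}

echo normalizeAllCaps('SOFIA GARCIA DE LA MANCHA'), PHP_EOL;
echo normalizeAllCaps('Garcia Marques, Gabriel'), PHP_EOL;
```

The result would then be passed to $parser->parse() in place of the raw input.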

dxdc, Jul 21 '20 02:07

@dxdc have you tried adjusting the combined initials setting?

luads, Jul 21 '20 03:07

good call @luads!

$parser->setMaxCombinedInitials(1);

achieves the same result as:

    if ( $input === strtoupper( $input ) ) {
        $input = ucwords( strtolower( $input ) );
    }

It might be worth temporarily overriding any user-defined setMaxCombinedInitials value and using 1 whenever $input === strtoupper( $input )... just a thought.

So, that leaves just examples like:

  • Garcia Marques, Gabriel
  • Pérez Quiñones, Manuel A.

This kind of "two surname" structure is common in Mexican last names, for example. Not sure of the best way to handle it... although maybe it should be treated the same way the name would be without the comma, i.e.

  • Gabriel Garcia Marques
  • Manuel A. Pérez Quiñones
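
The "treat it as if the comma were absent" idea could be prototyped as a preprocessing step, sketched below. reorderCommaName is a hypothetical helper, not part of name-parser, and it assumes the first comma always separates the surname(s) from the given name(s). Inputs such as "John Smith, Jr." would be reordered incorrectly.

```php
<?php

/**
 * Hypothetical helper (not part of name-parser): turn "Last, First"
 * into "First Last". Assumes the first comma separates the surname(s)
 * from the given name(s); suffixes after a comma are not handled.
 */
function reorderCommaName(string $input): string
{
    // Split on the first comma only.
    $parts = explode(',', $input, 2);

    // No comma: nothing to reorder.
    if (count($parts) < 2) {
        return trim($input);
    }

    $last  = trim($parts[0]);
    $first = trim($parts[1]);

    // If either side is empty (e.g. "Smith,"), return the trimmed input.
    if ($last === '' || $first === '') {
        return trim($input);
    }

    return $first . ' ' . $last;
}

echo reorderCommaName('Garcia Marques, Gabriel'), PHP_EOL;
echo reorderCommaName('Pérez Quiñones, Manuel A.'), PHP_EOL;
```

The reordered string could then be handed to $parser->parse(). Whether the parser assigns both surnames to the last name in that form is not verified here.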

dxdc, Jul 21 '20 03:07