Initials Formatting

Open waylan opened this issue 1 year ago • 0 comments

I wanted to remove any extraneous characters from the initials and only have the initials with no punctuation or whitespace. In the process I stumbled upon two shortcomings with the formatting of initials.

Setting initials delimiter to empty string

>>> from nameparser import HumanName
>>> HumanName('Doe, John A.').initials()
'J. A. D.'
>>> HumanName('Doe, John A.', initials_delimiter='').initials()
'J. A. D.'                                                                 <=  EXPECTED 'J A D'
>>> from nameparser.config import CONSTANTS
>>> CONSTANTS.initials_delimiter = ''
>>> HumanName('Doe, John A.').initials()
'J A D'
>>> HumanName('Doe, John A.', initials_format='{first}{middle}{last}').initials()
'JAD'

It seems that while one can set initials_delimiter to an empty string via CONSTANTS, it is not possible via the keyword argument on HumanName. Presumably, this is because an empty string evaluates to False here:

https://github.com/derek73/python-nameparser/blob/759a1316f2fda4395714f36d777fd014dcdd51b0/nameparser/parser.py#L99

I would expect this could be fixed by changing that line to:

self.initials_delimiter = initials_delimiter if initials_delimiter is not None else self.C.initials_delimiter
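To illustrate the difference, here is a minimal standalone sketch. It is not the library's code: resolve_with_or and resolve_with_none_check are hypothetical names, and DEFAULT_DELIMITER is an assumed stand-in for self.C.initials_delimiter.

```python
# Stand-in for self.C.initials_delimiter (assumed to be '.', per the output above).
DEFAULT_DELIMITER = "."


def resolve_with_or(initials_delimiter=None):
    # Truthiness check: '' is falsy, so the default silently wins.
    return initials_delimiter or DEFAULT_DELIMITER


def resolve_with_none_check(initials_delimiter=None):
    # Identity check: only a missing argument (None) falls back to the default.
    if initials_delimiter is not None:
        return initials_delimiter
    return DEFAULT_DELIMITER


print(repr(resolve_with_or("")))          # '.'
print(repr(resolve_with_none_check("")))  # ''
print(repr(resolve_with_none_check()))    # '.'
```

With the None check, an explicit empty string survives, while omitting the argument still falls back to the configured default.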

Removing all whitespace from initials is not possible with multi-part names.

>>> from nameparser import HumanName
>>> from nameparser.config import CONSTANTS
>>> CONSTANTS.initials_delimiter = ''
>>> HumanName('Doe, John A. Kenneth', initials_format='{first}{middle}{last}').initials()
'JA KD'                                                                  <=  EXPECTED 'JAKD'
>>> HumanName('Doe, John A. Kenneth', initials_delimiter='.', initials_format='{first}{middle}{last}').initials()
'J.A. K.D.'                                                              <=  EXPECTED 'JAKD'

This one is not so easy to fix. The code joins the parts together with a space hard-coded in.

https://github.com/derek73/python-nameparser/blob/759a1316f2fda4395714f36d777fd014dcdd51b0/nameparser/parser.py#L270-L277

You could require the space to be part of the delimiter, but that might result in weird output for certain formats (e.g., {last}, {first} {middle}), and it would be a backward-incompatible change for anyone who has already defined custom delimiters. Maybe a separate setting needs to be defined for this, although I have no idea what name to give it.
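As a rough sketch of what a separate setting could look like, here is a toy model of the joining behavior. It is not the library's implementation: format_initials and its separator parameter are hypothetical, and the model skips the whitespace cleanup the real method may perform. With separator=' ' it reproduces the outputs reported above.

```python
def format_initials(first, middle, last, delimiter=".", separator=" ",
                    initials_format="{first} {middle} {last}"):
    """Toy model of initials formatting.

    `separator` is a hypothetical setting; separator=" " mimics the
    hard-coded space that joins the initials within one name part.
    """
    def join(parts):
        letters = [part[0] for part in parts if part]
        if not letters:
            return ""
        # Each initial is followed by the delimiter; initials inside the
        # same name part are additionally separated by `separator`.
        return (delimiter + separator).join(letters) + delimiter

    return initials_format.format(
        first=join(first), middle=join(middle), last=join(last)
    )


compact = "{first}{middle}{last}"
parts = (["John"], ["A.", "Kenneth"], ["Doe"])

# Current behavior (space hard-coded between the two middle initials).
print(format_initials(*parts, delimiter="", initials_format=compact))   # JA KD
print(format_initials(*parts, delimiter=".", initials_format=compact))  # J.A. K.D.

# Hypothetical separator setting removes the remaining whitespace.
print(format_initials(*parts, delimiter="", separator="",
                      initials_format=compact))                         # JAKD
```

Keeping the default at a single space would leave existing custom delimiters unaffected, which avoids the backward-compatibility concern.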

In the end, I worked around both issues with ''.join(name.initials_list()), but it would be nice to be able to have full control with the provided formatting options.

waylan avatar Apr 05 '24 20:04 waylan