
Enhancement: Get country by unambiguous name substring

Open csindle opened this issue 4 years ago • 0 comments

Get the unambiguous matching Country from a partial name.

E.g.


>>> from iso3166 import countries

>>> countries.get('Bolivia')
Country(name='Bolivia, Plurinational State of', alpha2='BO', alpha3='BOL', numeric='068', apolitical_name='Bolivia, Plurinational State of')

>>> countries.get("Moldova")
Country(name='Moldova, Republic of', alpha2='MD', alpha3='MDA', numeric='498', apolitical_name='Moldova, Republic of')

More cases that would now return the expected Country:

  • United Kingdom
  • Russia
  • Syria
  • Iran
  • Macedonia (actually "North Macedonia", so a simple prefix match with .startswith() would not suffice)
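
To illustrate the Macedonia case without depending on the library, here is a minimal sketch. The `NAMES` list is a small hand-copied sample used as a stand-in for the full country list; it is not part of iso3166.

```python
# Hand-copied sample of country names; a stand-in for the full iso3166 list.
NAMES = [
    "North Macedonia",
    "Moldova, Republic of",
    "Iran, Islamic Republic of",
]

query = "macedonia"

# Prefix matching only finds names that begin with the query.
prefix_hits = [n for n in NAMES if n.lower().startswith(query)]

# Substring matching finds the query anywhere in the name.
substring_hits = [n for n in NAMES if query in n.lower()]

print(prefix_hits)     # []
print(substring_hits)  # ['North Macedonia']
```

With a prefix match "Macedonia" finds nothing, while the substring test finds the single expected entry.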

This enhancement partially solves https://github.com/deactivated/python-iso3166/issues/21, though not the specific "South Korea" case.

Independent code to explain behaviour:

from typing import Optional

from iso3166 import countries, Country


def sub_get(partial_name: str) -> Optional[Country]:
    """
    Get the single matching Country from a partial name.
    partial_name:  The country name, or sub-string thereof, to find.
    Return:  The single matching Country, or None if nothing matches.
    Raises:  KeyError if more than one country name matches.
    """
    name = partial_name.lower()
    country = None
    for key in INDEX:
        if name in key:    ###   Crux   ###
            if country is not None:
                # Ambiguous partial_name
                raise KeyError
            country = INDEX[key]

    return country


INDEX = {c.name.lower(): c for c in countries}

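tests = (
    'United Kingdom', 'Iran', 'RUSSIA', 'Macedonia',  # Return a Country
    'spam',                                           # Returns None
    'New', 'south', ' ',                              # Raise KeyError
    )

for test in tests:
    # Catch the KeyError so one ambiguous name does not stop the loop.
    try:
        print(sub_get(test))
    except KeyError:
        print(f'{test!r} is ambiguous')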

Since every country name has to be searched, this is a linear scan and certainly slower than a dictionary lookup.
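
One possible way to limit that cost is to try an exact dictionary lookup first and only fall back to the scan on a miss. The sketch below is hypothetical and not part of iso3166: `SAMPLE_INDEX` and `lookup` are illustrative names, and the index maps lower-cased names to alpha2 codes instead of Country objects so that it runs without the library.

```python
from typing import Dict, Optional

# Hypothetical stand-in index: lower-cased name -> alpha2 code.
SAMPLE_INDEX: Dict[str, str] = {
    "niger": "NE",
    "nigeria": "NG",
    "north macedonia": "MK",
    "new zealand": "NZ",
    "new caledonia": "NC",
}


def lookup(partial_name: str, index: Dict[str, str] = SAMPLE_INDEX) -> Optional[str]:
    """Return the single match, None if nothing matches; raise KeyError if ambiguous."""
    name = partial_name.lower()
    # Fast path: an exact name is a constant-time dictionary hit.
    if name in index:
        return index[name]
    # Slow path: scan every key for the substring.
    matches = [value for key, value in index.items() if name in key]
    if len(matches) > 1:
        raise KeyError(partial_name)
    return matches[0] if matches else None


print(lookup("Niger"))      # NE
print(lookup("macedonia"))  # MK
print(lookup("spam"))       # None
```

Note that the fast path changes behaviour slightly: an exact name such as "Niger" is returned directly, whereas a pure substring scan would treat it as ambiguous because it also occurs in "Nigeria".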

csindle, Oct 12 '21 08:10