
Incorrect detection of "Pty Limited" Suffix

Open Sir-Onion opened this issue 5 years ago • 3 comments

>>> cleanco("Example Example Pty Ltd").clean_name() # CORRECT
'Example Example'
>>> cleanco("Example Example Pty Limited").clean_name() # Not so good
'Example Example Pty'

To give you a view of the scope of the problem: I'm working to normalise a database of around 900k company names which have been typed into an application over a 10 year period. The database contains primarily companies from anglophone countries. Of these, around 580 have a company name like this.

Do you see this as a problem also? If so, I'm happy to put together a patch.

Sir-Onion avatar Feb 28 '20 11:02 Sir-Onion

Thank you. I did a quick Google search on the topic and this seems valid. A GitHub PR is welcome if you can submit one.

petri avatar Apr 16 '20 20:04 petri

@tubasal is "pty ltd" (or "pty limited") its own legal form or is this suffix just a concatenation of two different suffixes? You can get rid of multiple suffixes by running the removal twice.
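For illustration, the "run the removal twice" idea can be sketched as a loop that strips one trailing suffix at a time until nothing changes. This is only a sketch and does not use cleanco: the `SUFFIXES` tuple and both helper names are hypothetical, and treating "pty" as a standalone suffix is an assumption made for the example.

```python
# Hypothetical suffix list for illustration; not cleanco's real term data.
SUFFIXES = ("limited", "ltd", "pty")


def strip_suffix_once(name, suffixes=SUFFIXES):
    """Remove one trailing suffix word, if present (case-insensitive)."""
    words = name.rstrip(" .,").split()
    # Keep at least one word so a name is never reduced to nothing.
    if len(words) > 1 and words[-1].lower().rstrip(".") in suffixes:
        return " ".join(words[:-1])
    return name


def strip_suffixes(name, suffixes=SUFFIXES):
    """Apply strip_suffix_once repeatedly until the name stops changing."""
    previous = None
    while previous != name:
        previous, name = name, strip_suffix_once(name, suffixes)
    return name


print(strip_suffixes("Example Example Pty Limited"))
print(strip_suffixes("Example Example Pty Limited", suffixes=("limited", "ltd")))
```

The second call leaves "Pty" in place, because repeated removal only helps when every piece of the compound suffix is itself a known term.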

petri avatar Apr 25 '20 15:04 petri

I took a look at the term definitions. We don't have pty as a separate term, nor do we have pty limited, so running the removal twice cannot work here. Presuming the work on using ISO standard 20275 bears fruit, this issue might be fixed by the improved term definitions that the standard provides. On the other hand, it's possible that the term definitions there fall short in the same way as ours.
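To make the gap concrete, here is a minimal sketch of suffix removal driven by a term list, where the longest matching term wins. It is not cleanco's implementation: the `TERMS` list and the `clean_name` function are hypothetical, and the list assumes that "pty ltd" and "pty limited" would each be defined as a single multi-word term.

```python
# Hypothetical term list; cleanco's real definitions live in its own data.
TERMS = ["ltd", "limited", "pty ltd", "pty limited"]


def clean_name(name, terms=TERMS):
    """Strip the longest matching trailing term, compared case-insensitively."""
    stripped = name.strip()
    lowered = stripped.lower()
    # Try longer terms first so "pty limited" is preferred over "limited".
    for term in sorted(terms, key=len, reverse=True):
        if lowered.endswith(" " + term):
            return stripped[: -len(term)].rstrip(" ,")
    return stripped


print(clean_name("Example Example Pty Limited"))
print(clean_name("Example Example Pty Limited", terms=["ltd", "limited"]))
```

With only "ltd" and "limited" defined, the second call returns the name with "Pty" still attached, which is the behaviour reported at the top of this issue.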

petri avatar Apr 26 '20 15:04 petri