Base name is not working with some names

Open sandeepnatoo opened this issue 3 years ago • 4 comments

I checked some scenarios where the basename function returns an empty result:

```python
from cleanco import basename

print("Base name for {} : {}".format('IKS APS', basename("IKS APS")))
print("Base name for {} : {}".format('S.C.S & COMPANY', basename("S.C.S & COMPANY")))
print("Base name for {} : {}".format('COOP', basename("COOP")))
```

sandeepnatoo • Jan 12 '22 16:01

Yes, the point of basename is to remove common suffixes, prefixes, etc., leaving just the base name. The inputs you're giving consist entirely of those suffixes/prefixes, or combinations of them. What problem is this causing you? Are those actual company names that you are trying to normalize?
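To make the behavior described above concrete, here is a toy sketch of suffix stripping. It is not cleanco's actual implementation: the term set `TOY_TERMS` and the helpers `_normalize` and `toy_basename` are hypothetical and only mimic the idea of removing trailing organization-type terms.

```python
import re

# Hypothetical term set for illustration only; cleanco's real term data is far larger.
TOY_TERMS = {"aps", "bv", "coop", "scs", "company", "sro"}


def _normalize(token: str) -> str:
    """Lowercase a token and drop non-alphanumerics, so 'S.C.S' becomes 'scs'."""
    return re.sub(r"[^a-z0-9]", "", token.lower())


def toy_basename(name: str) -> str:
    """Strip trailing tokens that are known terms or pure punctuation."""
    tokens = name.split()
    while tokens:
        key = _normalize(tokens[-1])
        if key and key not in TOY_TERMS:
            break  # first trailing token that is not a known term: stop stripping
        tokens.pop()
    return " ".join(tokens)


print(repr(toy_basename("Coop Supermarkten BV")))  # 'Coop Supermarkten'
print(repr(toy_basename("COOP")))                  # ''
print(repr(toy_basename("S.C.S & COMPANY")))       # ''
```

When every token of the input is itself a known term, nothing is left once stripping finishes, which is why the result is an empty string.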

petri • Feb 09 '22 18:02

Coop is a Dutch supermarket (full name: 'Coop Supermarkten BV', and the full name actually works fine). And indeed, the basename of 'Coop' is "" (empty string). The same goes for SCS: it appears under "Limited" in the terms_by_type dict, while the full name 'SCS Software s.r.o.' also works just fine.

I think that if the code, in its last iteration of removing terms, finds that it would remove everything, there should be a way to recover the result of the previous iteration (though maybe not by default, because removing multiple terms is actually handy). Of course, this check can also be done on the user side, and it should at least be mentioned in the README/documentation.

FBnil • Aug 16 '22 16:08

> Coop is a Dutch supermarket (full name: 'Coop Supermarkten BV', and the full name actually works fine). And indeed, the basename of 'Coop' is "" (empty string). The same goes for SCS: it appears under "Limited" in the terms_by_type dict, while the full name 'SCS Software s.r.o.' also works just fine.
>
> I think that if the code, in its last iteration of removing terms, finds that it would remove everything, there should be a way to recover the result of the previous iteration (though maybe not by default, because removing multiple terms is actually handy). Of course, this check can also be done on the user side, and it should at least be mentioned in the README/documentation.

Yes, I agree with you.

sandeepnatoo • Aug 16 '22 16:08

> Yes, the point of basename is to remove common suffixes, prefixes, etc., leaving just the base name. The inputs you're giving consist entirely of those suffixes/prefixes, or combinations of them. What problem is this causing you? Are those actual company names that you are trying to normalize?

Yes, these are some of the organization names I came across.

sandeepnatoo • Aug 16 '22 17:08