python-stringcase

snakecase("HTTPResponse") produces "h_t_t_p_response"

Open lepsch opened this issue 8 years ago • 4 comments

I think an acronym should be converted as a single "word", e.g. HTTPResponse should be converted to http_response.

lepsch Jun 07 '17 16:06

While that may make perfect sense to us humans, this is a very hard problem to solve without including a large acronym lookup dictionary for all the different acronyms that should be treated differently from the rules.

Without a lookup dictionary there is absolutely no difference between "AAAAAaaaaaaa" and "HTTPResponse".

kenodegard Jul 06 '17 16:07

@njalerikson @okunishinishi

> is a very hard problem to solve without including a large acronym lookup dictionary for all the different acronyms

It should not be difficult to provide such a lookup dictionary by adding a sensible collection of common acronyms (embedded in code) to the library. Such a collection should not be too difficult to find. Still, the better solution would be to let users pass their own list of acronyms.

More intricate is the actual implementation. It can certainly be done via regex and groups. Maybe stringcase could be restructured as a class with the current functions as (class) methods. A list of strings could then be passed to `__init__` on instantiation.
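As a rough illustration of that idea, here is a minimal sketch of a converter class that takes the acronym list at instantiation. The name `CaseConverter` and its methods are hypothetical and are not part of stringcase. It uses a regex alternation rather than groups, and tries longer acronyms first.

```python
import re


class CaseConverter:
    """Hypothetical sketch, not the stringcase API."""

    def __init__(self, acronyms=()):
        # Longest acronyms first, so "HTTPS" is tried before "HTTP".
        known = sorted({a for a in acronyms if a}, key=len, reverse=True)
        alternatives = [re.escape(a) for a in known]
        # Generic rules: an optionally capitalised lowercase/digit word,
        # or a single uppercase letter.
        alternatives += [r"[A-Z]?[a-z0-9]+", r"[A-Z]"]
        self._word = re.compile("|".join(alternatives))

    def words(self, text):
        # The pattern has no groups, so findall returns whole matches.
        return self._word.findall(text)

    def snakecase(self, text):
        return "_".join(word.lower() for word in self.words(text))


converter = CaseConverter(["HTTP", "FTP"])
print(converter.snakecase("HTTPResponse"))        # http_response
print(CaseConverter().snakecase("HTTPResponse"))  # h_t_t_p_response
```

Without an acronym list the sketch falls back to splitting on every capital letter, which reproduces the behaviour reported in this issue.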

A non-regex approach for finding acronyms in the context of string case conversions can be found here: https://github.com/jdc0589/CaseConversion/blob/master/case_parse.py

Essentially, this method returns a list of words in PascalCase. These words can then be combined to give the various cases. It should be easy to implement. The method takes an optional list of strings as predefined acronyms (e.g. ["HTTP", "FTP"]); if no such list is given, it uses a fallback method. That fallback does not use regex, unlike the approach in the comment below. If @okunishinishi wants to extend stringcase, it would be better to replace that fallback with a pure regex approach.
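To make the word-list idea concrete, here is a small non-regex sketch. It does not reproduce the linked `case_parse.py`. The function name `split_words` is made up for illustration, it returns the words as found rather than normalised to PascalCase, and its fallback is deliberately naive (every capital starts a new word).

```python
def split_words(text, acronyms=()):
    """Hypothetical helper: split text into words, honouring known acronyms."""
    # Ignore empty strings and try the longest acronyms first.
    known = sorted((a for a in acronyms if a), key=len, reverse=True)
    words = []
    current = ""
    i = 0
    while i < len(text):
        match = next((a for a in known if text.startswith(a, i)), None)
        if match is not None:
            # A known acronym closes the pending word and becomes its own word.
            if current:
                words.append(current)
                current = ""
            words.append(match)
            i += len(match)
            continue
        ch = text[i]
        # Naive fallback: any uppercase letter starts a new word.
        if ch.isupper() and current:
            words.append(current)
            current = ""
        current += ch
        i += 1
    if current:
        words.append(current)
    return words


words = split_words("HTTPResponse", ["HTTP", "FTP"])
print(words)                                # ['HTTP', 'Response']
print("_".join(w.lower() for w in words))   # http_response
```

Once the words are available, each target case is just a different way of joining them.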

pykong Aug 04 '17 23:08

Here is a pure regex approach that does not chop runs of uppercase letters into single characters in the way described above. It also does not rely on a lookup dict, and it would be trivial to implement.

https://stackoverflow.com/questions/1175208/elegant-python-function-to-convert-camelcase-to-snake-case
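For reference, one widely used two-pass regex pattern for this looks roughly like the sketch below. It may not be identical to the code in the linked answer, and the function name `camel_to_snake` is chosen here for illustration.

```python
import re


def camel_to_snake(name):
    # Pass 1: put an underscore before a capitalised word that follows
    # any character, e.g. "HTTPResponse" -> "HTTP_Response".
    partial = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", name)
    # Pass 2: put an underscore between a lowercase letter or digit
    # and a following capital, e.g. "getHTTP" -> "get_HTTP".
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", partial).lower()


print(camel_to_snake("HTTPResponse"))         # http_response
print(camel_to_snake("getHTTPResponseCode"))  # get_http_response_code
```

Because it applies purely positional rules, it treats "AAAAAaaaaaaa" the same way as "HTTPResponse" and yields "aaaa_aaaaaaaa", which is the ambiguity raised earlier in this thread.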

pykong Aug 05 '17 21:08

This new package offers acronym detection: https://github.com/AlejandroFrias/case-conversion

pykong Mar 11 '18 17:03