New and useful methods for commons.text.CaseUtils

Open speters33w opened this issue 10 months ago • 2 comments

This adds several methods to CaseUtils that let a user convert a string to a variety of cases. These methods normalize the strings to ANSI Latin (accented characters are stripped, e.g. "ñ" becomes "n"). I created new tests based on the existing tests for toCamelCase(String, boolean, char[]) and ran them; all the output is as expected. The main driver engine for the new methods is in toTitleCase(). This is what you get:

 *                              "Two words" "foo bar" "Piñata Café"
 * toCamelCase(str)             "twoWords"  "fooBar"  "pinataCafe"
 * toCamelCase(str, false, " ") "twoWords"  "fooBar"  "piñataCafé"
 * toCamelCase(str, true, " ")  "TwoWords"  "FooBar"  "PiñataCafé"
 * toCamelSnakeCase(str)        "two_Words" "foo_Bar" "pinata_Cafe"
 * toFlatcase(str)              "twowords"  "foobar"  "pinatacafe"
 * toKebabCase(str)             "two-words" "foo-bar" "pinata-cafe"
 * toScreamingCase(str)         "TWOWORDS"  "FOOBAR"  "PINATACAFE"
 * toScreamingKebabCase(str)    "TWO-WORDS" "FOO-BAR" "PINATA-CAFE"
 * toScreamingSnakeCase(str)    "TWO_WORDS" "FOO_BAR" "PINATA_CAFE"
 * toSnakeCase(str)             "two_words" "foo_bar" "pinata_cafe"
 * toTitleCase(str)             "Two_Words" "Foo_Bar" "Pinata_Cafe"
 * toTrainCase(str)             "Two-Words" "Foo-Bar" "Pinata-Cafe"
 * toUpperCamelCase(str)        "TwoWords"  "FooBar"  "PinataCafe"

speters33w avatar Apr 09 '24 22:04 speters33w

You make some good points about having at most one or two methods with a parameter for the delimiter. I will revisit this.

One concern: the char[] parameter in the existing toCamelCase method defines the delimiters to exclude from the input, whereas my methods would take a char or String for the character to place between words in the output.

I will definitely revisit this and do a new push to my fork.
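To illustrate the distinction, here is a minimal JDK-only sketch. The class and method names (`DelimiterVsSeparator`, `splitOn`, `joinWith`) are hypothetical and are not part of CaseUtils or of this PR; they only show that delimiters act on the input while a separator acts on the output.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the CaseUtils implementation.
final class DelimiterVsSeparator {

    // Input side: split on any of the given delimiter characters and drop them.
    static List<String> splitOn(String input, char... delimiters) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (isDelimiter(c, delimiters)) {
                if (current.length() > 0) {
                    words.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            words.add(current.toString());
        }
        return words;
    }

    private static boolean isDelimiter(char c, char[] delimiters) {
        for (char d : delimiters) {
            if (c == d) {
                return true;
            }
        }
        return false;
    }

    // Output side: place a single separator character between the words.
    static String joinWith(List<String> words, char separator) {
        return String.join(String.valueOf(separator), words);
    }

    public static void main(String[] args) {
        List<String> words = splitOn("foo.bar baz", ' ', '.');
        System.out.println(words);                // [foo, bar, baz]
        System.out.println(joinWith(words, '_')); // foo_bar_baz
    }
}
```

Several delimiters can be consumed from the input, but only one separator is written to the output, which is why the two parameters are hard to merge into one.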

Thank you.

On Wed, Apr 10, 2024, 09:35 Matt Benson @.***> wrote:

@.**** commented on this pull request.

I am missing the rationale for embedding the stripping of accent chars into the notion of converting to title case, or any other conversion. You are already delegating to a utility method for this purpose; couldn't the API consumer do that if needed? I would also prefer a design more like one common implementation with parameters for delimiter, case(, etc.?) rather than e.g. kebab = snake + s/_/-/ .

Additionally, I have a preference for "Pascal" to "upper camel" for the seeming uniformity of having a single word to describe the desired structure, notwithstanding combinations like "screaming snake."
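For reference, a consumer-side accent-stripping step could be sketched with the JDK alone, as below. This is an assumption-laden illustration, not the utility method the PR delegates to: the class and method names are hypothetical, and the approach (NFD decomposition followed by removal of combining marks) only handles characters that decompose into a base letter plus marks.

```java
import java.text.Normalizer;

// Hypothetical helper; shows accent stripping as a separate step the caller applies.
final class AccentStripSketch {

    static String stripAccents(String input) {
        // Decompose characters such as n-tilde into a base letter plus a combining mark.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // Remove the combining marks (Unicode category M), keeping the base letters.
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // The "Pinata Cafe" example with n-tilde and e-acute, written as Unicode escapes.
        String accented = "Pi\u00f1ata Caf\u00e9";
        System.out.println(stripAccents(accented)); // Pinata Cafe
    }
}
```

A caller who wants unaccented output would run this step before the case conversion; a caller who wants to keep "ñ" and "é" would simply skip it.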

View it on GitHub: https://github.com/apache/commons-text/pull/528#pullrequestreview-1991714141

speters33w avatar Apr 10 '24 19:04 speters33w

I've rewritten the code in CaseUtils based on the comments and on some more aggressive tests I devised. I removed many unnecessary methods, such as toScreamingCase, that a user can easily reproduce. I changed the main engine to toDelimitedCase, which accepts a boolean to uncapitalize the very first letter of the string, and a "separator" (named to differentiate it from the delimiters in the existing toCamelCase()). I retained my toCamelCase(str), toPascalCase, toSnakeCase, and toKebabCase as conveniences, even though a user could reproduce them with toDelimitedCase. I added some todos that I haven't completed yet: precompiling the regex patterns, reducing the cyclomatic complexity of toDelimitedCase, and testing whether I should replace my instance of StringBuilder with commons.text.TextStringBuilder.

I also wrote some very aggressive tests with null values for all parameters, line breaks, tabs, etc.

https://github.com/speters33w/commons-text/commit/5972912f9613fbdd1360c5bc3f5fe175e05ab973

This is what you get in the revised version:

 *                                                           "Two words" "foo bar" "Piñata Café"
 * camelCase        toCamelCase(str)                         "twoWords"  "fooBar"  "pinataCafe"
 * camelCase        toCamelCase(str, false, " ")             "twoWords"  "fooBar"  "piñataCafé"
 * camel_Snake      toDelimitedCase(str, false, '_')         "two_Words" "foo_Bar" "pinata_Cafe"
 * flatcase         toPascalCase(str).toLowerCase()          "twowords"  "foobar"  "pinatacafe"
 * kebab-case       toKebabCase(str)                         "two-words" "foo-bar" "pinata-cafe"
 * PascalCase       toPascalCase(str)                        "TwoWords"  "FooBar"  "PinataCafe"
 * PascalCase       toCamelCase(str, true, " ")              "TwoWords"  "FooBar"  "PiñataCafé"
 * SCREAMINGCASE    toPascalCase(str).toUpperCase()          "TWOWORDS"  "FOOBAR"  "PINATACAFE"
 * SCREAMING-KEBAB  toDelimitedCase(str, '-').toUpperCase()  "TWO-WORDS" "FOO-BAR" "PINATA-CAFE"
 * SCREAMING_SNAKE  toDelimitedCase(str, '_').toUpperCase()  "TWO_WORDS" "FOO_BAR" "PINATA_CAFE"
 * snake_case       toSnakeCase(str)                         "two_words" "foo_bar" "pinata_cafe"
 * Title_Case       toDelimitedCase(str, '_')                "Two_Words" "Foo_Bar" "Pinata_Cafe"
 * Train-Case       toDelimitedCase(str, '-')                "Two-Words" "Foo-Bar" "Pinata-Cafe"
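
As a rough sketch of how one engine can yield several rows of the table above, the following JDK-only code assumes a signature of (String, boolean, char). It is not the code from the linked commit: the class name and the parameter name `capitalizeFirstLetter` are assumptions, chosen so that `false` gives "two_Words" as in the camel_Snake row. It splits on whitespace only and does not strip accents.

```java
import java.util.Locale;

// Hypothetical sketch; the real toDelimitedCase in the fork may differ.
final class DelimitedCaseSketch {

    static String toDelimitedCase(String str, boolean capitalizeFirstLetter, char separator) {
        if (str == null) {
            return null;
        }
        StringBuilder out = new StringBuilder();
        for (String word : str.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only for blank input
            }
            String lower = word.toLowerCase(Locale.ROOT);
            boolean first = out.length() == 0;
            if (!first) {
                out.append(separator);
            }
            if (first && !capitalizeFirstLetter) {
                out.append(lower); // leave the very first word uncapitalized
            } else {
                out.append(Character.toUpperCase(lower.charAt(0))).append(lower.substring(1));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toDelimitedCase("Two words", false, '_')); // two_Words
        System.out.println(toDelimitedCase("foo bar", true, '-'));    // Foo-Bar
        // Lower- and upper-case variants are left to the caller, as in the table.
        System.out.println(toDelimitedCase("Two words", true, '_').toLowerCase(Locale.ROOT)); // two_words
        System.out.println(toDelimitedCase("foo bar", true, '-').toUpperCase(Locale.ROOT));   // FOO-BAR
    }
}
```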

speters33w avatar Apr 15 '24 15:04 speters33w