commons-text
Apache Commons Text
Changes similar to what was recently released in commons-configuration 2.8.0. - Make default string lookups configurable via system property. - Remove dns, url, and script lookups from defaults.
Hello! This time, a much larger pull request. I added the HTML 5.0 entities to the EntityArray class. Here is how I produced the feature: - I got...
…inRange and selectFrom so that method chaining is meaningful ```java char[][] CHAR_PAIRS = {{'0', '9'}, {'A', 'Z'}, {'a', 'z'}}; GENERATOR = new RandomStringGenerator.Builder() .withinRange(CHAR_PAIRS) .selectFrom(',', ';', ':', '?', '*', '#',...
Hello, this is a quick bugfix for the NumericEntityUnescaper. The bug causes decimal character entities that lack a semicolon and are followed by a letter from A to F to be ignored by the...
This pull request adds the Sorensen-Dice Similarity Algorithm to Apache Commons Text. The Sorensen–Dice coefficient is a statistic used for comparing the similarity of two samples. It was independently developed...
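As a rough illustration of the coefficient itself, 2·|X ∩ Y| / (|X| + |Y|), here is a minimal sketch over sets of character bigrams. The class and method names (`DiceSketch`, `bigrams`, `similarity`) are hypothetical and not taken from the pull request, and treating equal strings as 1.0 and bigram-free strings as 0.0 is an assumption made for this sketch only.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; not the implementation proposed in the pull request.
final class DiceSketch {

    /** Collects the distinct character bigrams of s; strings shorter than two characters yield an empty set. */
    static Set<String> bigrams(String s) {
        Set<String> result = new HashSet<>();
        for (int i = 0; i + 1 < s.length(); i++) {
            result.add(s.substring(i, i + 2));
        }
        return result;
    }

    /** Sorensen-Dice coefficient over bigram sets: 2 * |X ∩ Y| / (|X| + |Y|). */
    static double similarity(String a, String b) {
        if (a.equals(b)) {
            return 1.0; // assumption: identical inputs (including two empty strings) are fully similar
        }
        Set<String> x = bigrams(a);
        Set<String> y = bigrams(b);
        if (x.isEmpty() || y.isEmpty()) {
            return 0.0; // assumption: no bigrams to compare means no measurable overlap
        }
        Set<String> common = new HashSet<>(x);
        common.retainAll(y); // set intersection
        return 2.0 * common.size() / (x.size() + y.size());
    }

    public static void main(String[] args) {
        // "night" -> {ni, ig, gh, ht}, "nacht" -> {na, ac, ch, ht}; one shared bigram
        System.out.println(similarity("night", "nacht")); // prints 0.25
    }
}
```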
Currently, if given text like so: ``` "arma virumque cano…" "“bread and circuses”" ``` `StringEscapeUtils` returns the corresponding Unicode characters for code points 128, 133, 147, and 148, which are...
`StringSubstitutor#substitute` would throw `StringIndexOutOfBoundsException` when removing a variable start marker if `StringSubstitutor` wasn't using the internal representation of `TextStringBuilder#buffer`. The local variable, `priorVariables`, contains unbalanced curly braces (replacing `$${${TEST}}` would...
Text 158
Fix Jaccard similarity with empty strings: due to a bug, the Jaccard similarity of two empty strings was 0.0 and the distance 1.0. It now returns a similarity of 1.0 and a distance of 0.0.
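The empty-string case can be illustrated with a minimal sketch of Jaccard similarity over character sets, |A ∩ B| / |A ∪ B|, where two empty inputs are treated as identical. The names (`JaccardSketch`, `chars`, `similarity`, `distance`) are hypothetical, and this is not the library's actual code.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; not the Commons Text implementation.
final class JaccardSketch {

    /** Collects the distinct characters of s. */
    static Set<Character> chars(CharSequence s) {
        Set<Character> result = new HashSet<>();
        for (int i = 0; i < s.length(); i++) {
            result.add(s.charAt(i));
        }
        return result;
    }

    /** Jaccard similarity: |A ∩ B| / |A ∪ B|, with two empty inputs defined as 1.0. */
    static double similarity(CharSequence left, CharSequence right) {
        if (left.length() == 0 && right.length() == 0) {
            return 1.0; // avoids 0/0 and matches the behavior described in the fix
        }
        Set<Character> a = chars(left);
        Set<Character> b = chars(right);
        Set<Character> union = new HashSet<>(a);
        union.addAll(b);
        Set<Character> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        return (double) intersection.size() / union.size();
    }

    /** Jaccard distance is the complement of the similarity. */
    static double distance(CharSequence left, CharSequence right) {
        return 1.0 - similarity(left, right);
    }

    public static void main(String[] args) {
        System.out.println(similarity("", "")); // prints 1.0
        System.out.println(distance("", ""));   // prints 0.0
    }
}
```

Handling the empty case before building the sets keeps the division safe, since the union is non-empty whenever at least one input has a character.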
…imiters is null