Use uucp caselesseq instead of structural equality and String.lowercase_ascii

Open cuihtlauac opened this issue 1 year ago • 2 comments

In the ocaml.org source code, strings are compared or searched ignoring case (i.e. in a case-insensitive manner). Most often, this is done using String.lowercase_ascii and either OCaml structural equality (=) or standard library functions such as String.sub or String.starts_with.

However, as @Octachron has noted, this is fragile. String.lowercase_ascii only changes ASCII letters, so any non-ASCII text is still compared case-sensitively. We'd better use robust, i18n-aware string functions from the Uucp library. Since ocaml.org already pulls in this library, this adds no new dependency. See: https://github.com/ocaml/ocaml.org/pull/2442
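To make the problem concrete, here is a minimal sketch using only the standard library. The helper name `ascii_caseless_equal` is made up for this illustration and is not taken from the ocaml.org code base; it only mimics the "lowercase both sides, then compare" pattern described above.

```ocaml
(* Hypothetical helper: the "lowercase_ascii, then (=)" pattern. *)
let ascii_caseless_equal a b =
  String.lowercase_ascii a = String.lowercase_ascii b

let () =
  (* Plain ASCII input behaves as expected. *)
  assert (ascii_caseless_equal "OCaml" "ocaml");
  (* U+00C9 (É) and U+00E9 (é) are encoded as different multi-byte UTF-8
     sequences, and lowercase_ascii leaves those bytes untouched, so the
     two spellings are not considered equal. *)
  assert (not (ascii_caseless_equal "\u{00C9}cole" "\u{00E9}cole"))
```

Any search or comparison built on this pattern will miss matches as soon as a package name, author name, or query contains a non-ASCII letter in a different case.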

There are several tasks involved here:

  1. [ ] Locate places where case-insensitive string comparison takes place
  2. [ ] Use Uucp functions to perform those comparisons
  3. [ ] Check no regression takes place
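As a rough idea of what task 2 could look like, the sketch below builds a comparison key by case-folding every character of a UTF-8 string. It is parameterised by a `fold` function with the same shape as `Uucp.Case.Fold.fold`, so the block itself only needs the standard library (OCaml 4.14 or later for the UTF-8 decoder). The names `caseless_key`, `caseless_equal` and `toy_fold` are assumptions made for this example, and `toy_fold` is a stand-in that covers only ASCII and two sample characters. Unicode normalisation is not handled here, which full caseless matching would also need.

```ocaml
(* Shape of a per-character case fold, mirroring Uucp.Case.Fold.fold. *)
type fold = Uchar.t -> [ `Self | `Uchars of Uchar.t list ]

(* Hypothetical helper: fold every scalar value of a UTF-8 string.
   Malformed bytes are decoded as U+FFFD by the stdlib decoder. *)
let caseless_key ~(fold : fold) s =
  let b = Buffer.create (String.length s) in
  let rec loop i =
    if i < String.length s then begin
      let d = String.get_utf_8_uchar s i in
      let u = Uchar.utf_decode_uchar d in
      (match fold u with
       | `Self -> Buffer.add_utf_8_uchar b u
       | `Uchars us -> List.iter (Buffer.add_utf_8_uchar b) us);
      loop (i + Uchar.utf_decode_length d)
    end
  in
  loop 0;
  Buffer.contents b

(* Two strings are caseless-equal when their folded keys are equal. *)
let caseless_equal ~fold a b =
  String.equal (caseless_key ~fold a) (caseless_key ~fold b)

(* Stand-in fold for this sketch only: ASCII letters, U+00C9 and U+00DF.
   Real code would pass Uucp.Case.Fold.fold instead. *)
let toy_fold u =
  match Uchar.to_int u with
  | c when c >= 0x41 && c <= 0x5A -> `Uchars [ Uchar.of_int (c + 32) ]
  | 0xC9 -> `Uchars [ Uchar.of_int 0xE9 ]
  | 0xDF -> `Uchars [ Uchar.of_int 0x73; Uchar.of_int 0x73 ]
  | _ -> `Self

let () =
  assert (caseless_equal ~fold:toy_fold "\u{00C9}cole" "\u{00E9}COLE");
  assert (caseless_equal ~fold:toy_fold "stra\u{00DF}e" "STRASSE")
```

With Uucp linked, the call sites would presumably use `caseless_equal ~fold:Uucp.Case.Fold.fold`. Folding both the pattern and the candidate once, then reusing the keys, keeps substring and prefix checks working on ordinary strings.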

cuihtlauac, May 21 '24 14:05

I ran a few greps to find the files where case-insensitive string comparison takes place:

  1. String.sub
./global/import.ml:9:12:        if String.sub s1 i len = s2 then raise Exit
./string_uppercase:1:3:./string_sub:./ocamlorg_frontend/pages/outreachy.eml:                                    <%s String.uppercase_ascii (String.sub project.mentee 0 1); %>
./string_uppercase:2:3:./string_sub:./ocamlorg_frontend/pages/outreachy.eml:                                    <%s String.uppercase_ascii (String.sub mentor 0 1); %>
./string_uppercase:3:3:./string_sub:./ocamlorg_frontend/pages/package_overview.eml:      <%s String.uppercase_ascii (String.sub user.name 0 1); %>
./string_uppercase:4:105:./ocamlorg_frontend/pages/outreachy.eml:                                    <%s String.uppercase_ascii (String.sub project.mentee 0 1); %>
./string_uppercase:5:105:./ocamlorg_frontend/pages/outreachy.eml:                                    <%s String.uppercase_ascii (String.sub mentor 0 1); %>
./string_uppercase:6:82:./ocamlorg_frontend/pages/package_overview.eml:      <%s String.uppercase_ascii (String.sub user.name 0 1); %>
./ocamlorg_data/data.ml:120:14:          if String.sub s1 i len = s2 then raise Exit
./ocamlorg_frontend/pages/outreachy.eml:32:65:                                    <%s String.uppercase_ascii (String.sub project.mentee 0 1); %>
./ocamlorg_frontend/pages/outreachy.eml:43:65:                                    <%s String.uppercase_ascii (String.sub mentor 0 1); %>
./ocamlorg_frontend/pages/package_overview.eml:27:35:      <%s String.uppercase_ascii (String.sub user.name 0 1); %>
./ocamlorg_frontend/pages/home.eml:311:28:                      <%s! String.sub item.body 0 (min (String.length item.body - 1) 100) %>...
./ocamlorg_frontend/components/search.eml:24:49:      if content_length < length then text else String.sub text 0 length
./ocamlorg_frontend/components/search.eml:32:12:      else String.sub text (content_length - length) length
  2. String.uppercase_ascii
./string_sub:./ocamlorg_frontend/pages/outreachy.eml:                                    <%s String.uppercase_ascii (String.sub mentor 0 1); %>
./string_sub:./ocamlorg_frontend/pages/package_overview.eml:      <%s String.uppercase_ascii (String.sub user.name 0 1); %>
./ocamlorg_frontend/pages/outreachy.eml:                                    <%s String.uppercase_ascii (String.sub project.mentee 0 1); %>
./ocamlorg_frontend/pages/outreachy.eml:                                    <%s String.uppercase_ascii (String.sub mentor 0 1); %>
./ocamlorg_frontend/pages/package_overview.eml:      <%s String.uppercase_ascii (String.sub user.name 0 1); %>
  3. String.lowercase_ascii
./ocamlorg_package/lib/ocamlorg_package.ml:543:15:    let str = String.lowercase_ascii str in
./ocamlorg_package/lib/ocamlorg_package.ml:561:31:  let match_ f s pattern = f (String.lowercase_ascii @@ s) pattern
./ocamlorg_web/lib/handler.ml:218:19:    let pattern = String.lowercase_ascii pattern in
./ocamlorg_web/lib/handler.ml:219:33:    let name_is_s { name; _ } = String.lowercase_ascii name = pattern in
./ocamlorg_web/lib/config.ml:4:9:  match String.lowercase_ascii s with "true" | "1" -> true | _ -> false
./ocamlorg_data/data.ml:115:19:    let pattern = String.lowercase_ascii s in
./ocamlorg_data/data.ml:127:30:           contains pattern (String.lowercase_ascii name))

sagnikc395, Aug 05 '24 16:08

Looking into Uucp functions to make those comparisons.

sagnikc395, Aug 05 '24 16:08