
Use of locale-sensitive methods with `undefined` locale may cause environment-sensitive bugs

Open lionel-rowe opened this issue 1 year ago • 1 comments

Describe the bug

I've mentioned this before tangentially in a couple of issues (ex), but it's probably worth its own issue.

Use of locale-sensitive methods, such as toLocaleLowerCase, with an unspecified (undefined) locale may cause environment-sensitive bugs, because the default locale is host defined.
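To make the difference concrete, here is a minimal sketch comparing the locale-independent and locale-sensitive methods on the string used in the reproduction further down. The explicit `"tr"` argument stands in for a host whose default locale is Turkish; the variable names are illustrative only.

```typescript
// Locale-independent: always uses the Unicode default case mapping.
const invariantLower = "EXPIRES".toLowerCase();

// Locale-sensitive with an explicit Turkish locale:
// "I" maps to dotless "ı" (U+0131) rather than "i".
const turkishLower = "EXPIRES".toLocaleLowerCase("tr");

// Locale-sensitive with no locale argument: the result depends on the
// host's default locale, so it can match either of the two values above.
const hostDependentLower = "EXPIRES".toLocaleLowerCase();

console.log(invariantLower); // "expires"
console.log(turkishLower === invariantLower); // false
console.log(hostDependentLower === invariantLower); // true or false, depending on the host
```

Any comparison against an ASCII literal such as `"expires"` is therefore only reliable when the lowercasing step is locale-independent.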

Steps to Reproduce

For example, in http/cookie.ts, which is marked browser-compatible:

https://github.com/denoland/std/blob/6a4eb6cb9185b0b9e0816a227c04b5bf039b99ef/http/cookie.ts#L361

  1. Open Firefox
  2. Go to https://codepen.io/lionel-rowe/pen/rNEXmoG?editors=0010:
    import { getSetCookies } from 'https://esm.sh/jsr/@std/[email protected]/cookie'
    
    const setCookie = 'a=b; EXPIRES=Thu, 19 Sep 2024 07:47:28 GMT'
    const headers = new Headers({ 'set-cookie': setCookie })
    const cookie = getSetCookies(headers)[0]
    
    document.querySelector('#target').textContent = JSON.stringify(cookie, null, 4)
    
  3. Observe expected output (assuming your locale isn't Turkish):
    {
        "name": "a",
        "value": "b",
        "expires": "2024-09-19T07:47:28.000Z"
    }
    
  4. Go to about:preferences
  5. Set language to Turkish ("Türkçe"; Turkish for "language" is "dil" if you need to find the setting again)
  6. Go back to or refresh https://codepen.io/lionel-rowe/pen/rNEXmoG?editors=0010 page
  7. Observe output:
    {
        "name": "a",
        "value": "b",
        "unparsed": [
            "EXPIRES=Thu, 19 Sep 2024 07:47:28 GMT"
        ]
    }
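The failure in step 7 can be approximated outside the browser with the sketch below. `isExpiresAttribute` and `LowerFn` are hypothetical, simplified stand-ins for attribute-name matching and are not the actual code in http/cookie.ts; the Turkish host is simulated by passing `"tr"` explicitly.

```typescript
// Hypothetical, simplified stand-in for Set-Cookie attribute matching;
// not the actual implementation in http/cookie.ts.
type LowerFn = (s: string) => string;

function isExpiresAttribute(attribute: string, lower: LowerFn): boolean {
  // Take everything before the first "=" as the attribute name.
  const key = attribute.slice(0, attribute.indexOf("=")).trim();
  return lower(key) === "expires";
}

const expiresPart = "EXPIRES=Thu, 19 Sep 2024 07:47:28 GMT";

// Simulates toLocaleLowerCase() on a host whose default locale is Turkish.
const turkishHostLower: LowerFn = (s) => s.toLocaleLowerCase("tr");
// Locale-independent lowercasing.
const invariantFn: LowerFn = (s) => s.toLowerCase();

console.log(isExpiresAttribute(expiresPart, turkishHostLower)); // false
console.log(isExpiresAttribute(expiresPart, invariantFn)); // true
```

Under the simulated Turkish host the key lowercases to "expıres", so the attribute is not recognized, which is consistent with it ending up in `unparsed`.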
    

Expected behavior

Consistent behavior in all locales

Environment


  • [x] http/cookie
  • [x] http/negotiation
  • [ ] text

lionel-rowe avatar Sep 19 '24 08:09 lionel-rowe

As far as I can see, the problematic uses are toLocaleLowerCase and toLocaleUpperCase in http/cookie.ts, http/_negotiation/encoding.ts, and in the various casing functions under text. By contrast, fmt/bytes.ts is a good example of making locale sniffing opt-in:

https://github.com/denoland/std/blob/6a4eb6cb9185b0b9e0816a227c04b5bf039b99ef/fmt/bytes.ts#L193-L195

and testing it with minimal hard-coding:

https://github.com/denoland/std/blob/6a4eb6cb9185b0b9e0816a227c04b5bf039b99ef/fmt/bytes_test.ts#L9-L10

https://github.com/denoland/std/blob/6a4eb6cb9185b0b9e0816a227c04b5bf039b99ef/fmt/bytes_test.ts#L75-L82

lionel-rowe avatar Sep 19 '24 09:09 lionel-rowe

Can this be closed now?

kt3k avatar Nov 22 '24 06:11 kt3k

> Can this be closed now?

I opened https://github.com/denoland/std/pull/6204 with the minimal fix for text, which is just replacing all occurrences of toLocaleXCase with toXCase. Maybe in future an option could be added to specify a locale and/or opt in to host locale sensitivity, if there's demand for it.

lionel-rowe avatar Nov 22 '24 07:11 lionel-rowe
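
For illustration, one possible shape for the opt-in option floated in the last comment is sketched below. `CaseOptions` and `lowerCase` are hypothetical names and do not correspond to any existing API in text; the default path stays locale-independent, and locale sensitivity has to be requested explicitly.

```typescript
// Hypothetical options shape; not an existing API in the text module.
interface CaseOptions {
  /**
   * BCP 47 language tag(s) to use for case mapping, or `true` to opt in
   * to the host's default locale. Omit for locale-independent behavior.
   */
  locale?: string | string[] | true;
}

function lowerCase(input: string, options: CaseOptions = {}): string {
  const { locale } = options;
  // Default: same result in every environment.
  if (locale === undefined) return input.toLowerCase();
  // Explicit opt-in to host locale sensitivity.
  if (locale === true) return input.toLocaleLowerCase();
  // Explicit locale supplied by the caller.
  return input.toLocaleLowerCase(locale);
}

console.log(lowerCase("TITLE")); // "title"
console.log(lowerCase("TITLE", { locale: "tr" })); // "tıtle"
```

With this design, protocol-level code such as header or cookie parsing would simply never pass `locale`, while user-facing text formatting could.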