faker icon indicating copy to clipboard operation
faker copied to clipboard

refactor(location): replace zipCode format parameter with style

Open ST-DDT opened this issue 1 year ago • 15 comments

First part of #2390

  • #2390

Deprecates the format parameter in favor of using the native locale patterns or using faker.helpers.replaceSymbols directly.

ST-DDT avatar Oct 26 '24 11:10 ST-DDT

Deploy Preview for fakerjs ready!

Name Link
Latest commit dc2089ecb6fb5c075d31ea38fda94a2aa2af76b3
Latest deploy log https://app.netlify.com/sites/fakerjs/deploys/67715b753043530008cb08a7
Deploy Preview https://deploy-preview-3223.fakerjs.dev
Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

netlify[bot] avatar Oct 26 '24 11:10 netlify[bot]

Codecov Report

Attention: Patch coverage is 94.11765% with 1 line in your changes missing coverage. Please review.

Project coverage is 99.97%. Comparing base (eceb17d) to head (dc2089e).

Files with missing lines Patch % Lines
src/modules/location/index.ts 94.11% 1 Missing :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##             next    #3223      +/-   ##
==========================================
- Coverage   99.97%   99.97%   -0.01%     
==========================================
  Files        2811     2811              
  Lines      217017   217029      +12     
  Branches      942      943       +1     
==========================================
+ Hits       216966   216977      +11     
- Misses         51       52       +1     
Files with missing lines Coverage Δ
src/modules/location/index.ts 99.57% <94.11%> (-0.43%) :arrow_down:

codecov[bot] avatar Oct 26 '24 11:10 codecov[bot]

Hmm

Given that the default en locale uses the patterns #####, #####-#### I would imagine that a very common case would be forcing 5 digit ZIPs using format:"#####"

We might want to provide better guidance for this scenario.

matthewmayer avatar Oct 30 '24 13:10 matthewmayer

Hmm

Given that the default en locale uses the patterns #####, #####-#### I would imagine that a very common case would be forcing 5 digit ZIPs using format:"#####"

We might want to provide better guidance for this scenario.

I'm not sure what you are asking for here. Should I add an explicit example for zipCode('#####') to replaceSymbols('#####') or what? IMO that is sufficiently covered by the deprecation message already:

https://github.com/faker-js/faker/blob/ebd0dd04a6055c8878f82c565d6ad13aa34ddd40/src/modules/location/index.ts#L55

I could change the runtime message to include the actual format parameter though:

- 'faker.location.zipCode(), faker.location.zipCode({ state }), or faker.helpers.replaceSymbols(format)',
+ `faker.location.zipCode(), faker.location.zipCode({ state }), or faker.helpers.replaceSymbols('${format}')`,

I hope the main usecase for that method is without the format parameter using the patterns provided by the locale.

Here are some calls using the format parameter.

  • https://github.com/search?q=%22faker.location.zipCode%28%27%22&type=code (175)
  • https://github.com/search?q=%22faker.location.zipCode%28%5C%22%22&type=code (62)

And without format parameter

  • https://github.com/search?q=%22faker.location.zipCode%28%29%22&type=code (1.7k)
  • https://github.com/search?q=%22faker.location.zipCode%28%7B%22&type=code (42)

ST-DDT avatar Oct 30 '24 13:10 ST-DDT

I think my issue is that zipCode("#####") is expressive in a way that none of the suggested replacements are

For example if I have code like

const address = [faker.location.streetAddress(), faker.location.city(), faker.location.state(), faker.location.zipCode("#####")].join("\n")

I feel to someone reading the code that's more intuitive than a call to say faker.string.numeric()

I think "I only want 5 digit zip codes" should be something the zip code method can handle.

matthewmayer avatar Nov 03 '24 09:11 matthewmayer

One of the reasons I don't like that pattern is that it focuses on the wrong issue. If you want to have a five digit postcode, why don't you use the one provided by your locale? Does it have more than one variant? aka one short format and one long format? Isn't the actual issue that there is no "format"/variant option to select between the two?

ST-DDT avatar Nov 03 '24 18:11 ST-DDT

Yes agreed. Having an option where ZIP+4 is only used if an extended:true parameter is used might be good.

ZIP+4 is rarely used in practice in the US.

matthewmayer avatar Nov 03 '24 23:11 matthewmayer

Yes agreed. Having an option where ZIP+4 is only used if an extended:true parameter is used might be good.

ZIP+4 is rarely used in practice in the US.

Unfortunately, I have no clue how the US zip code system works. Other than speeding up international mail delivery (SC) by a week/50% if I use a 9(?) digit zip code instead of the usual 5(?).

Does other countries have similar systems?

ST-DDT avatar Nov 04 '24 08:11 ST-DDT

ChatGPT says:

Most Americans do not know their ZIP+4 code from memory and typically only use the standard five-digit ZIP code in everyday contexts. The ZIP+4 code system is primarily used by the USPS and businesses involved in bulk mailing or shipping, where accuracy and speed are essential. Here’s why ZIP+4 codes are not widely known or commonly used by the general public:

1.	Limited Necessity: For most personal mail and packages, the five-digit ZIP code is sufficient. USPS delivers accurately without requiring the extra four digits.
2.	Lack of Familiarity: ZIP+4 codes aren’t widely taught, and they’re not typically requested for most purposes, like filling out forms, so people rarely commit them to memory.
3.	Complexity: ZIP+4 codes can be specific to buildings, floors, or even particular departments within large companies or institutions. For individuals at residential addresses, the ZIP+4 may not be consistently applied or is often only referenced in specific settings.
4.	Automation: Online platforms and shipping services that benefit from ZIP+4 often auto-generate or auto-fill it during the address verification process. This eliminates the need for individuals to know or use their ZIP+4 manually.

In short, unless a ZIP+4 code is directly needed for business or government purposes, most people only remember and use the five-digit ZIP.

matthewmayer avatar Nov 04 '24 10:11 matthewmayer

Do you know any other country that has multiple variants of zip codes?

ST-DDT avatar Nov 04 '24 10:11 ST-DDT

Not really. In the UK for example postcodes are like "CB1 1DQ" and in some cases the first part "CB1" might be used to refer to an area like "central Cambridge ". But the default postcode is definitely the long/full one.

matthewmayer avatar Nov 04 '24 11:11 matthewmayer

Would 'short' | 'long' be sufficient to distinguish between them?

I'll lookup our other locale's zip patterns for this as well.

ST-DDT avatar Nov 04 '24 12:11 ST-DDT

The question is what is the default. For US zip we want short to be default. But in theory what if there's an another locale where the regular zip code is the long version but there's a short version which is a useful alternative.

matthewmayer avatar Nov 04 '24 13:11 matthewmayer

The question is what is the default.

How about: The default is a random one. That way you can check how your system behaves with unexpected values. If you want a specific format you will look up the options and then pick one that you assume does the thing you need.

ST-DDT avatar Nov 04 '24 13:11 ST-DDT

Team Decision

  1. We will deprecate the format parameter in favor of a style: 'short' | 'long' parameters.
  2. We noticed that user facing zipCode API uses postcode locale data internally. We considered renaming the method, but at the current time we will not implement that. We will discuss this again in the next team meeting
  3. We would like to merge postcode and postcode_by_state into a single object postcode: Record<LiteralUnion<'national'/'*'>, PostCodeDefinitions> + change from replaceSymbols to fake patterns (This will be done in a later PR to reduce visual diff in this PR)

The postcode definition might look like this:

type PostCodeDefinitions =
  | {
      pattern: string[];
      short_pattern?: never;
      long_pattern?: never;
    }
  | {
      pattern?: never;
      short_pattern: string[];
      long_pattern: string[];
    };

If the style parameter is specified, but the current locale only supports one format, then the parameter is ignored. If the style parameter is not specified and the current locale uses multiple formats, then a random one will be picked.

ST-DDT avatar Nov 28 '24 20:11 ST-DDT