Specific prefixes can cause invalid codes to return a valid result

Open Snowysauce opened this issue 7 months ago • 1 comment

This issue is a bit esoteric, so it might take a bit of explaining.

I first noticed something was amiss when location sets in the NSI with codes like "de-by" (without the .geojson suffix) were accepted as valid. However, country-coder interprets "de-by" as Belarus ("by"), not Bavaria, and "de-bw" as Botswana ("bw"), not Baden-Württemberg, and so on.

Further testing and research showed the same behavior for all of the items inside this regex: https://github.com/rapideditor/country-coder/blob/7b90672ea8102fbdf5e829dc2cdc948f7675a754/src/country-coder.ts#L120

In other words, "el-by", "la-by", "of-by", etc. are all recognized as valid codes for Belarus when they should be rejected as invalid.

This behavior is present for all codes, regardless of their source index (e.g., "el-001", "el-us-ak", etc. all map to the code that follows "el-").
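
A minimal sketch of how this can happen, assuming the query is normalized by stripping a fixed list of small words plus punctuation before the lookup. The `SMALL_WORDS` pattern, `normalize`, `lookup`, and the two-entry `KNOWN` table are hypothetical stand-ins for illustration, not the library's actual code; the real pattern is at the link above.

```typescript
// Hypothetical filter: removes standalone small words and common punctuation.
// This is an assumed approximation, not the pattern used by country-coder.
const SMALL_WORDS = /\b(and|the|of|el|la|de)\b|[-_ .,'()&\[\]\/]/gi;

// Normalize a query the way the sketch assumes: strip, then uppercase.
function normalize(query: string): string {
  return query.replace(SMALL_WORDS, '').toUpperCase();
}

// Tiny illustrative lookup table (hypothetical, two entries only).
const KNOWN: Record<string, string> = { BY: 'Belarus', BW: 'Botswana' };

function lookup(query: string): string | undefined {
  return KNOWN[normalize(query)];
}

// "de" is dropped as a small word and "-" as punctuation, leaving "BY".
console.log(lookup('de-by')); // Belarus
console.log(lookup('el-by')); // Belarus
console.log(lookup('de-bw')); // Botswana
```

Under these assumptions the prefix never reaches the lookup, so any code preceded by one of the filtered words resolves to whatever follows it.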

Snowysauce avatar Apr 29 '25 23:04 Snowysauce

I did check and I confirm the query processor does remove those small words.

I'm not sure whether this actually breaks anything in location-conflation or NSI. I feel like if the .geojson suffix is present, it won't even check country-coder for a code. But I agree it is kind of weird, and maybe we should rethink it.

bhousel avatar Jul 18 '25 20:07 bhousel