zlint
FR: Detect country name in stateOrProvinceName
There have been many cases of misissuance with an invalid stateOrProvinceName. Having a lint warning for that would help detect them.
Can you suggest something specific here that you think we're missing? What requirements in BRs, RFCs, or root programs are not being checked today? As stated, I think that this issue is too broad to be readily addressable (or for us to even know if there is something to fix in ZLint).
This is not explicitly a requirement of the BRs. I think this is WontFix for now, or at best a request for a WARNING level notice?
The context here is the issue around country explicitly being an ISO 3166-1 two-letter code, and a desire that stateOrProvince be interpreted as an ISO 3166-2 subdivision. This has been discussed in the Forum before, but the BRs presently offer CAs some flexibility and discretion here (e.g. allowing localized names). That said, that discretion has led to a lot of invalid values (e.g. placing country names here even when XX is not used, which is not allowed), so @mimi89999 is highlighting the benefit of some flagging when things deviate. For example, a CA could treat a WARNING level result as requiring manual review and approval, and could allowlist local representations.
The requirement, FWIW, is 7.1.4.2.2(e) of the current requirements.
Maybe a better solution would be to detect the full country name in stateOrProvinceName and warn in such a case. Section 7.1.4.2.2(f) states:
If the subject:countryName field specifies the ISO 3166-1 user-assigned code of XX in accordance with Section 7.1.4.2.2(g), the subject:stateOrProvinceName field MAY contain the full name of the Subject’s country information as verified under Section 3.2.2.1.
I understand it as "if and only if", so putting the country name in stateOrProvinceName in other cases would not be allowed.
What data source would be used to compare this information to?
They are on Wikipedia and in the ISO standard. The standard is not free, but ISO kindly names its sources (see "List source" at https://www.iso.org/obp/ui/#iso:code:3166:PL), so the list could be rebuilt from those sources.
The PKI Consortium would like to bring the following lint to your attention:
https://github.com/pkic/zlint/blob/state-province/v3/lints/community/lint_subject_dn_state_unknown.go
The initial goal of the lint is to identify incorrect values by combining multiple authoritative data sources. Currently the data is obtained directly from the European Union and combined with data from ISO 3166-2.
We also created a runner to test the lint against the crt.sh database: https://github.com/pkic/testlint
The system can support additional sources, including a manual source, but the process for manually adding and reviewing entries has not been defined. The intent is to have multiple region owners who verify regional data changes against the official government data before approving them.
https://github.com/pkic/regions/
Together with the Universal Postal Union (a United Nations specialized agency), we are discussing common problems and best practices, and exploring whether we can extend the capabilities to locality, postal codes, and actual addresses where applicable.
The PKI Consortium is comprised of leading organizations that are committed to improving, creating, and collaborating on generic, industry or use-case specific policies, procedures, best practices, standards, and tools that advance trust in assets and communication for everyone and everything using Public Key Infrastructure (PKI) as well as the security of the internet in general.
We welcome and encourage anyone that can add value to join us to support projects like these and others: https://pkic.org/join/
Why do all provinces have province appended to the province name like in https://github.com/pkic/regions/blob/main/data/pl.yaml? Most certificates I saw didn't have province in the province name.
When you look at the data you can see the following:
    # Holy Cross
    - codes:
        euvoc: http://eurovoc.europa.eu/7971
        iso3166-2: PL-26
      names:
        - name: Holy Cross
          sources:
            - name: euvoc
              languages: [en]
              value: Holy Cross province
The codes on the first lines map back to the source of the information.
As you can see in the authoritative data source (http://eurovoc.europa.eu/7971), these values are reported as such by the local governments. We retain the original value under sources, but you can also see that we normalized the value and removed the province suffix in the name field. Currently both values will be treated as valid.
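A minimal Go sketch of how "both values are treated as valid" could work for one entry, assuming the YAML fields shown above. The type and function names (`Source`, `RegionName`, `acceptedValues`) are hypothetical and are not taken from the pkic/regions or zlint code:

```go
package main

// Source mirrors one item under "sources" in the YAML snippet.
// Value holds the original spelling from the source and is empty
// when the source lists the name exactly as in the "name" field.
type Source struct {
	Name      string
	Languages []string
	Value     string
}

// RegionName mirrors one item under "names".
type RegionName struct {
	Name    string // normalized name
	Sources []Source
}

// acceptedValues returns the normalized name followed by every
// distinct original value retained under sources.
func acceptedValues(n RegionName) []string {
	accepted := []string{n.Name}
	seen := map[string]bool{n.Name: true}
	for _, s := range n.Sources {
		if s.Value != "" && !seen[s.Value] {
			seen[s.Value] = true
			accepted = append(accepted, s.Value)
		}
	}
	return accepted
}
```

For the Holy Cross entry this would yield both "Holy Cross" and "Holy Cross province" as acceptable stateOrProvinceName values.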
The data is not even consistent. Some entries have a value with województwo, others don't have it:
    - name: małopolskie
      sources:
        - name: euvoc
          languages: [pl]
          value: województwo małopolskie
        - name: iso3166-2
          languages: [pl]
    [...]
    - name: Mazowieckie
      sources:
        - name: euvoc
          languages: [el, pl]
        - name: iso3166-2
          languages: [pl]
I think the last few comments are probably better captured as issues/discussion on the https://github.com/pkic/regions/ repository instead of here. Thanks!
This project is in active development and we are still reviewing the data and results, so we really appreciate all feedback.
You always need to check it against the authoritative source.
- If a value property is included in the source section, the name has been normalized.
- If the value property is not included in the source section, the name is listed exactly as in the source.
Let's have these discussions about the pkic/regions project/data directly at the repository as @cpu suggested.
I think a basic warning level lint for a country name appearing in the stateOrProvinceName field is a good starting point here. I don't think, at this stage, that it's feasible to implement a full-on regional check for the ST field, and it's best left to the individual CAs to build their own list of acceptable ST values, especially as there is a lot of flexibility in the BRs for this.
A warning would still be beneficial here though, and given there is a list of country names available (even if there isn't a list including every variation of the name, we only need a few common ones to warn here), perhaps it could give CAs a starting point to investigate their ST values further to see if some level of misissuance is occurring.
@FozzieHi For a "warning level lint for a country name appearing in the stateOrProvinceName field", how would you handle name collisions? For example, "Georgia" is both a State in the U.S. and a Country in Europe.
@robstradling The simple way would be to tie the country name in the ST field to the countryName field in the certificate. So it would only warn if "Georgia" was in the stateOrProvinceName field and GE was in the countryName field. I haven't seen many (if any) certificates where there is a country name in the ST field which is different from the country in the countryName field.
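A rough Go sketch of that check, assuming a small lookup table from ISO 3166-1 alpha-2 codes to country names. The table contents and the names `countryNames` and `stateIsOwnCountryName` are illustrative assumptions and not part of ZLint; a real lint would read both values from the parsed certificate subject and return a warn-level result:

```go
package main

import "strings"

// countryNames is a tiny illustrative sample, not a complete or
// authoritative list. Keys are ISO 3166-1 alpha-2 codes.
var countryNames = map[string][]string{
	"GE": {"Georgia"},
	"PL": {"Poland", "Polska"},
	"US": {"United States", "United States of America"},
}

// stateIsOwnCountryName reports whether the stateOrProvinceName value
// is a known name of the country given in countryName. Tying the match
// to the certificate's own country code avoids flagging C=US, ST=Georgia.
func stateIsOwnCountryName(countryCode, state string) bool {
	state = strings.TrimSpace(state)
	for _, name := range countryNames[strings.ToUpper(countryCode)] {
		if strings.EqualFold(state, name) {
			return true
		}
	}
	return false
}
```

With this table, C=GE with ST=Georgia would produce a warning, while C=US with ST=Georgia would not. A certificate with C=XX is never flagged because XX has no entry in the table, which matches the case where the BRs allow the country name in this field.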