validator
bcp47_language_tag doesn't fail on some non-BCP47 tags
- [x] I have looked at the documentation here first?
- [x] I have looked at the examples provided that may showcase my question here?
Package version eg. v9, v10:
v10
Issue, Question or Enhancement:
When using bcp47_language_tag for validation, some non-BCP47 tags such as "eng" or "en_US" are passing as valid.
isBCP47LanguageTag() uses golang.org/x/text/language's Parse and its documentation says:
[snip] It accepts tags in the BCP 47 format and extensions to this standard defined in https://www.unicode.org/reports/tr35/#Unicode_Language_and_Locale_Identifiers.
Code sample, to showcase or reproduce:
I expect both of these to fail, but they don't:
package main

import (
	"fmt"

	"github.com/go-playground/validator/v10"
)

func main() {
	validate := validator.New()

	// Underscore separator: expected to fail, but no error is returned.
	err := validate.Var("en_US", "bcp47_language_tag")
	if err != nil {
		fmt.Println(err.Error())
		return
	}

	// Three-letter code "eng": expected to fail, but no error is returned.
	err = validate.Var("eng", "bcp47_language_tag")
	if err != nil {
		fmt.Println(err.Error())
		return
	}
}
I think golang.org/x/text/language's Parse follows the Unicode Language and Locale Identifiers defined in the Unicode Locale Data Markup Language (LDML). These are based on BCP 47, but the two are not strictly the same. For example, Unicode Language and Locale Identifiers allow the underscore _ to be used as a separator:
sep = [-_] ;
BCP 47, however, allows only the hyphen:
langtag = language
          ["-" script]
          ["-" region]
          *("-" variant)
          *("-" extension)
          ["-" privateuse]
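For comparison, the langtag production above can be checked with the standard library alone. This is only a minimal sketch, not how validator implements the tag: the name wellFormed and the regular expression are my own illustration, transcribed from the RFC 5646 grammar. It ignores grandfathered tags and tags that consist only of a private-use part, and it does not look anything up in the IANA subtag registry.

```go
package main

import (
	"fmt"
	"regexp"
)

// langtagRE is a hypothetical, hand-written transcription of the RFC 5646
// "langtag" production. Only "-" is accepted as a separator.
var langtagRE = regexp.MustCompile(`(?i)^` +
	`(?:[a-z]{2,3}(?:-[a-z]{3}){0,3}|[a-z]{4,8})` + // language, optional extlang
	`(?:-[a-z]{4})?` + // script
	`(?:-(?:[a-z]{2}|[0-9]{3}))?` + // region
	`(?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*` + // variants
	`(?:-[0-9a-wy-z](?:-[a-z0-9]{2,8})+)*` + // extensions
	`(?:-x(?:-[a-z0-9]{1,8})+)?` + // private use
	`$`)

// wellFormed reports whether tag matches the langtag syntax. It checks
// syntax only; it does not check subtags against the registry.
func wellFormed(tag string) bool {
	return langtagRE.MatchString(tag)
}

func main() {
	for _, tag := range []string{"en-US", "en_US", "eng", "zh-Hant-TW"} {
		fmt.Printf("%-12q well-formed: %v\n", tag, wellFormed(tag))
	}
}
```

With this sketch "en_US" is rejected because of the underscore, but "eng" still matches: it is syntactically well-formed, and rejecting it would need a lookup against the IANA Language Subtag Registry, which as far as I can tell registers only "en" for English.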
The LDML specification has a section called BCP 47 Conformance, which reads:
It allows certain syntax for backwards compatibility (not BCP 47-compatible):
- The "_" character for field separator characters, as well as the "-" used in