
YaSearchBrowser UA is mis-detected as Chrome — bundled uap-core is outdated

Open GeorgiySurkov opened this issue 6 months ago • 1 comment

Package: ua-parser 1.0.1 (PyPI, released Feb 1 2025)
Issue: UA strings containing the token YaSearchBrowser are incorrectly parsed as Chrome instead of Yandex Browser.


Example

from ua_parser import parse

ua = (
    "Mozilla/5.0 (Linux; Android 10; TECNO LC7 Build/QP1A.190711.020) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.5414.117 "
    "Mobile Safari/537.36 YaApp_Android/9.02 YaSearchBrowser/9.02"
)

print(parse(ua).user_agent.family)
# → 'Chrome'      ← should be 'Yandex Browser'

Root cause

  • The package still bundles uap-core at commit d668d6c (uap-core v0.18.0, merged Feb 2024).
  • Support for YaSearchBrowser was added later in https://github.com/ua-parser/uap-core/pull/578 (April 2024) along with test coverage.
  • Without that rule in the bundled data, these UA strings fall through to the generic Chrome regex.

Suggested fix

  1. Update the embedded regexes.yaml to the latest uap-core (any commit after PR #578).
  2. Release a patch version (e.g. 1.0.2) to deliver the fix downstream.
  3. Optionally: add the UA string above as a test case to prevent future regressions.

Thanks for maintaining the project!

GeorgiySurkov avatar Jun 16 '25 17:06 GeorgiySurkov

Hello, thank you for the report!

Unbundling the set of precompiled regexes from ua-parser was one of the changes that went into 1.0 (#232). The precompiled regexes are now part of the separate ua-parser-builtins wheel.

And part of the reason to do that was specifically the ability to cut "unreleased" uap-core commits as precompiled. The repository has a monthly cron which creates a new bundle from the current master of uap-core at that point.

Because -builtins tries to match the versioning of the upstream, those intermediate releases are "prereleases", so you need to use pip install --pre in order to access them: by default pip will only install "stable" releases. But that's about it. You can see the intermediate releases in the release history page on PyPI: https://pypi.org/project/ua-parser-builtins/#history
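To confirm which build of the regex bundle is actually installed after switching to prereleases, a small stdlib-only check along these lines may help. This is only a sketch: the helper name installed_version is made up for illustration, and the distribution name is taken from the PyPI link above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the installed version, or None if the wheel is not installed.
    print(installed_version("ua-parser-builtins"))
```

Comparing the printed version against the release history page shows whether a stable or an intermediate build was picked up.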

If you test with that, you will see that e.g. ua-parser/uap-core#611 test cases are properly extracted. Same with the 4 test cases in ua-parser/uap-core#578

Now I can understand you'd be less than chuffed that your strings are stuck in prerelease, but unless -core has come out and said they can't be arsed to cut releases anymore I'm not super willing to create versions for their projects.


With that being said, the specific UA string you provide here is correctly incorrect (although it's the wrong incorrect: with the latest dataset it reports Chrome Mobile). The pattern from ua-parser/uap-core#578 is

(YaSearchBrowser)/(\d+)\.(\d+)\.(\d+)

It requires a 3-part version number, but the UA string you posted here only has 2 (YaSearchBrowser/9.02). So unless I missed something (which is very much possible, mind), even loading the current HEAD of uap-core is not going to report Yandex Browser.
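The mismatch can be reproduced without ua-parser at all by running the quoted pattern through Python's re module. This is a minimal sketch, assuming the pattern is applied as an unanchored search. The three-part UA string and the helper name yandex_groups are hypothetical, added only for contrast.

```python
import re

# Pattern quoted above from ua-parser/uap-core#578.
YA_SEARCH_BROWSER = re.compile(r"(YaSearchBrowser)/(\d+)\.(\d+)\.(\d+)")

# UA string from the original report: the YaSearchBrowser version has two parts (9.02).
reported_ua = (
    "Mozilla/5.0 (Linux; Android 10; TECNO LC7 Build/QP1A.190711.020) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.5414.117 "
    "Mobile Safari/537.36 YaApp_Android/9.02 YaSearchBrowser/9.02"
)

# Hypothetical variant with a three-part version (…YaSearchBrowser/9.02.1),
# invented here purely for comparison.
three_part_ua = reported_ua + ".1"


def yandex_groups(ua):
    """Return the captured groups if the pattern matches anywhere in ua, else None."""
    match = YA_SEARCH_BROWSER.search(ua)
    return match.groups() if match else None


print(yandex_groups(reported_ua))    # None
print(yandex_groups(three_part_ua))  # ('YaSearchBrowser', '9', '02', '1')
```

The reported string fails because the pattern needs a second literal dot after "02", and the string ends there.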

masklinn avatar Jun 16 '25 19:06 masklinn