
Implement text embedding processing for ad matching MVP

Open moritzhaller opened this issue 2 years ago • 1 comments

moritzhaller avatar Jun 13 '22 12:06 moritzhaller

Removing from the 1.43.x milestone because https://github.com/brave/brave-core/pull/14393 is still open and hasn't been merged into master. We're releasing 1.43.x this week, so there's no way this will make it. We only move issues into a milestone once they've actually landed in it.

kjozwiak avatar Aug 30 '22 04:08 kjozwiak

@ptjames Adding the QA/Blocked label because the test plan mentions building locally, which QA does not do. Please provide an additional test plan; you can work with @btlechowski on this as needed.

LaurenWags avatar Sep 28 '22 20:09 LaurenWags

Verification passed on

Brave 1.45.75 Chromium: 106.0.5249.65 (Official Build) beta (64-bit)
Revision 3269dc3633cdd2ab94546fdbe54962e45b17a6e0-refs/branch-heads/5249@{#580}
OS Ubuntu 18.04 LTS

Verified test plan from https://github.com/brave/brave-core/pull/14393

Verified text embedding is working:

[5051:5051:1009/140553.354039:VERBOSE3:text_embedding_processor.cc(72)] Successfully logged text embedding HTML event
[5051:5051:1009/140553.373045:VERBOSE3:text_embedding_processor.cc(80)] Successfully purged stale text embedding HTML events

Verified embeddings are being stored in the text_embedding_html_event table: [image]

Verified embeddings are not duplicated for the same id:

Before page reload: [image]
After page reload: [image]

Verified when embedding is not possible, proper message is shown:

[6876:6876:1009/144455.972530:VERBOSE1:text_embedding_processor.cc(53)] No text available for embedding
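For anyone repeating this check against a saved log file, the lines quoted above can be filtered with a small script. This is only a sketch: the regular expression is inferred from the three log lines shown in this comment, and the helper names (`parse_log_line`, `embedding_messages`) are hypothetical, not part of brave-core.

```python
import re

# Prefix layout inferred from the quoted lines (an assumption, not an official spec):
# [pid:tid:MMDD/HHMMSS.micros:LEVEL:file(line)] message
LOG_LINE = re.compile(
    r"^\[(?P<pid>\d+):(?P<tid>\d+):(?P<date>\d{4})/(?P<time>\d{6}\.\d+):"
    r"(?P<level>[A-Z]+\d*):(?P<file>[^()]+)\((?P<line>\d+)\)\]\s*(?P<message>.*)$"
)


def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def embedding_messages(lines, source_file="text_embedding_processor.cc"):
    """Collect the messages emitted from the given source file."""
    messages = []
    for line in lines:
        record = parse_log_line(line)
        if record and record["file"] == source_file:
            messages.append(record["message"])
    return messages


# The three lines quoted in this comment, used as sample input.
SAMPLE = [
    "[5051:5051:1009/140553.354039:VERBOSE3:text_embedding_processor.cc(72)] "
    "Successfully logged text embedding HTML event",
    "[5051:5051:1009/140553.373045:VERBOSE3:text_embedding_processor.cc(80)] "
    "Successfully purged stale text embedding HTML events",
    "[6876:6876:1009/144455.972530:VERBOSE1:text_embedding_processor.cc(53)] "
    "No text available for embedding",
]

if __name__ == "__main__":
    for message in embedding_messages(SAMPLE):
        print(message)
```

To use it on a real run, replace `SAMPLE` with the lines read from the browser's log output and look for the "Successfully logged" or "No text available" messages.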

Logged https://github.com/brave/brave-browser/issues/25878 for failed embedding on Wikipedia pages.

btlechowski avatar Oct 09 '22 12:10 btlechowski