
Cv2 5370 media text split

Open DGaffney opened this issue 1 year ago • 2 comments

Description

Small tweaks to make full end-to-end tiplines work locally

Reference: CV2-5370

How has this been tested?

Tested extensively locally and confirmed to work

Have you considered secure coding practices when writing this code?

None

DGaffney avatar Oct 10 '24 16:10 DGaffney

If you can give some more detail in the description about what the changes are intended to do, it would help me review more quickly. It looks like there is some simple naming refactoring, plus moving some calls to async blocking, adding paraphrase, and calling a separate endpoint for text-based queries (I'm guessing this is the "media text split" in the title). Is that right?

skyemeedan avatar Oct 10 '24 19:10 skyemeedan

> If you can give some more detail in the description about what the changes are intended to do, it would help me review more quickly. It looks like there is some simple naming refactoring, plus moving some calls to async blocking, adding paraphrase, and calling a separate endpoint for text-based queries (I'm guessing this is the "media text split" in the title). Is that right?

That's basically right - I would summarize this as:

  • Make all env_files consistent with the mock S3 variables,
  • Move the sync endpoint for text to use blocking requests instead of legacy alegre-based fingerprinting,
  • Make some minor refactors in elastic_crud.py so that we return the necessary data when running through pending fingerprinting tasks,
  • Bypass fields that we should never check against when querying objects (since they are either purely metadata or globally unique),
  • Add the paraphrase model,
  • Structure sync requests so that they can be searched on immediately,
  • Clean up the response objects in return_sources to dis-embed the keys we need to send back to Check API,
  • Change tests to accommodate all of the above.

Roughly in order of reading through the diff page.
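
To illustrate the "bypass fields" item above, here is a minimal sketch of the idea, not the code from this PR. The field names in `BYPASSED_FIELDS` and the helper `build_match_conditions` are hypothetical; the actual list of bypassed fields is whatever the diff defines.

```python
from typing import Any, Dict, List

# Hypothetical examples of fields that are purely metadata or globally
# unique, so matching on them would never be meaningful.
BYPASSED_FIELDS = {"doc_id", "created_at"}


def build_match_conditions(obj: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Build Elasticsearch-style match clauses, skipping bypassed fields."""
    return [
        {"match": {key: value}}
        for key, value in sorted(obj.items())
        if key not in BYPASSED_FIELDS and value is not None
    ]


conditions = build_match_conditions(
    {
        "doc_id": "abc",
        "content": "hello",
        "language": "en",
        "created_at": "2024-10-10",
    }
)
```

Only `content` and `language` end up in `conditions`; the resulting clauses could then be placed in the `must` list of a bool query.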

DGaffney avatar Oct 11 '24 14:10 DGaffney

Suspect Issues

This pull request was deployed and Sentry observed the following issues:

  • ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPSConnectionPool(host='aos-0e790625b530-irilyx4... (in api.similarity_similarity_resource)


sentry[bot] avatar Oct 18 '24 19:10 sentry[bot]