Wrong message logged from validation of broken links

Open krysal opened this issue 3 years ago • 1 comments

Description

The validation function assumes every link belongs to an image, so it logs messages like this:

openverse-api-web-1  | [2022-05-27 18:30:49,758 - catalog.api.utils.validate_images -  89][INFO] Deleting broken image with ID 4cba47fc-1ba6-43e1-adda-646bf7c4e0ae from results.

However, querying the database shows that this identifier is not in the image table but in the audio table:

SELECT * FROM image WHERE identifier = '4cba47fc-1ba6-43e1-adda-646bf7c4e0ae';
--- Returns 0 rows

SELECT * FROM audio WHERE identifier = '4cba47fc-1ba6-43e1-adda-646bf7c4e0ae';
--- Returns 1 row

The following utility function should be generalized to apply to more media types (in the immediate case, for audio).

https://github.com/WordPress/openverse-api/blob/a76a6de6a2effd221d3486996a97bcb1370370d7/api/catalog/api/utils/validate_images.py#L13-L17
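
As a rough illustration of the generalization, here is a minimal sketch that assumes the caller can pass the media type alongside the results. The names `filter_dead_links`, `media_type`, and `is_dead` are hypothetical and are not taken from the linked module; only the wording of the log message follows the output shown above.

```python
import logging

logger = logging.getLogger(__name__)


def filter_dead_links(results, media_type, is_dead):
    """Return the results whose links are alive, logging each removed one.

    Assumptions of this sketch (not the real module's API):
    - `results` is a list of dicts with an "identifier" key,
    - `media_type` is a string such as "image" or "audio",
    - `is_dead` is a callable that decides whether a result's link is broken.
    """
    kept = []
    for result in results:
        if is_dead(result):
            # The media type is interpolated instead of hard-coding "image".
            logger.info(
                "Deleting broken %s with ID %s from results.",
                media_type,
                result["identifier"],
            )
        else:
            kept.append(result)
    return kept
```

Called with `media_type="audio"`, the message would read "Deleting broken audio with ID ... from results." rather than pointing at the image table.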

Reproduction

  1. Run the audio tests:
just api-test -k audio_integration
  2. View the logs:
just logs web
  3. Confirm one of the logged identifiers by running the queries above.

Resolution

  • [ ] 🙋 I would be interested in resolving this bug.

krysal avatar May 27 '22 18:05 krysal

There are two tasks here.

  1. Enhancing the log: Do we have a one-to-one mapping between ES indices and media types? If so, we can use the ES index name in the log to identify the database table.

  2. Refactoring the validate_images module: Refactor the validate_images module to handle audio files as well. validate_media would be a better module name and would make it more generic.

ritesh-pandey avatar Aug 01 '22 09:08 ritesh-pandey

Hey @krysal I would like to take up this issue :)

Mishrasubha avatar Dec 02 '22 06:12 Mishrasubha

@ritesh-pandey

Do we have a one to one mapping between ES index and media types?

That's correct: each media type has its own ES index. Your suggestion of splitting this into two tasks sounds good.
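
With that mapping in place, the log message could derive the media type from the index name. The sketch below is only an illustration: it assumes an index name either equals the media type or starts with it followed by a hyphen, which this thread does not confirm. `KNOWN_MEDIA_TYPES` and `media_type_from_index` are hypothetical names.

```python
# Hypothetical list of media types; the thread only mentions image and audio.
KNOWN_MEDIA_TYPES = ("image", "audio")


def media_type_from_index(index_name):
    """Guess the media type (and thus the database table) from an ES index name.

    Assumes the index name is the media type itself or is prefixed by it
    followed by a hyphen. Falls back to the generic word "media" otherwise.
    """
    for media_type in KNOWN_MEDIA_TYPES:
        if index_name == media_type or index_name.startswith(media_type + "-"):
            return media_type
    return "media"
```

The fallback keeps the log message readable even for an index the function does not recognize.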


@Mishrasubha Thanks for showing your interest. Please go ahead!

krysal avatar Dec 02 '22 15:12 krysal