
Add additional logging around search_controller's ES query building

Open sarayourfriend opened this issue 2 years ago • 3 comments

Problem

Currently we only log the query itself, but not the process that led to the query being constructed. We are seeing queries get sent with sizes of up to 1800 but have no way of telling why 1800 was chosen for the size.

Description

Add logging to the _get_query_slice method. Log at each branch and be sure to log the variables that lead to each branch as well. The idea is to log as much as is necessary to understand how the method is working in production to produce the from and size results that are being sent to ES.

In addition, log some basic search facts like the term and the serializer data.

Be sure to introduce a "trace" variable so that we can easily follow a particular search request. This can just be a uuidv4 generated at the top of the search method and passed around and added to the logs.
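As a rough illustration of the kind of logging described above, here is a minimal sketch. The function bodies are hypothetical stand-ins and do not reproduce the real `_get_query_slice` logic; the `filter_dead` over-fetch branch, the parameter names, and the logger name are all assumptions made for the example. The point is only to show a uuidv4 trace generated at the top of the search function and threaded through every log line, with each branch logging the variables that selected it.

```python
import logging
import uuid

# Assumed logger name, for illustration only.
logger = logging.getLogger("search_controller_sketch")


def get_query_slice(page_size, page, filter_dead, trace):
    """Hypothetical stand-in for _get_query_slice (not the real logic)."""
    logger.info(
        "trace=%s get_query_slice page_size=%s page=%s filter_dead=%s",
        trace, page_size, page, filter_dead,
    )
    start = (page - 1) * page_size
    if filter_dead:
        # Assumed branch: over-fetch so results survive dead link filtering.
        size = page_size * 2
        logger.info("trace=%s branch=filter_dead from=%s size=%s", trace, start, size)
    else:
        size = page_size
        logger.info("trace=%s branch=plain from=%s size=%s", trace, start, size)
    return start, size


def search(term, page_size=20, page=1, filter_dead=True):
    # One uuidv4 per search request, passed around and added to every log line.
    trace = str(uuid.uuid4())
    logger.info("trace=%s term=%r page_size=%s page=%s", trace, term, page_size, page)
    start, size = get_query_slice(page_size, page, filter_dead, trace)
    logger.info("trace=%s final from=%s size=%s", trace, start, size)
    return trace, start, size
```

Filtering the logs on a single `trace` value would then show the inputs, the branch taken, and the resulting from and size for one request.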

Implementation

  • [ ] 🙋 I would be interested in implementing this feature.

sarayourfriend, Jun 28 '22 22:06

Fixed by https://github.com/WordPress/openverse-api/pull/777

sarayourfriend, Jul 12 '22 18:07

Re-opening for consideration, since some of the logging in #777 had to be reverted while triaging the API stability. Does it make sense to try to reintroduce some of the logging before revisiting the search controller refactor?

zackkrida, Aug 12 '22 18:08

Agreed, Zack, that we should add the logging back to the search controller.

sarayourfriend, Aug 19 '22 13:08

@WordPress/openverse-api Does this issue still make sense to implement as it reads today? I'm wondering if we still find value in this particular approach, or if we'd rather do something like labelling specific queries so that we can read timings for each query type. Given we've already identified the most time-intensive parts of the search controller (dead link filtering and very deep pagination), should we add logging around those specifically rather than more general logs?

sarayourfriend, Nov 21 '22 02:11

I think we can lower this issue's priority and look into Kibana, which should have everything built in to see what is happening inside Elasticsearch. The additional logging around dead link filtering and pagination sounds good to have in the short term as well.

krysal, Nov 22 '22 19:11

which should have everything built in to see what is happening inside Elasticsearch

Can you elaborate on what you mean by this? I'm not familiar with using Kibana in this way. Is there documentation I could read about it? Is Kibana equipped to do meta-analytics of our ES cluster?

sarayourfriend, Nov 24 '22 01:11