
Search problem appeared in last 5 days or so

Open bretth25 opened this issue 3 years ago • 4 comments

This query returns no records for the last 5 days - https://apps.fedoraproject.org/datagrepper/v2/search?user=bretth&delta=1000000&topic=org.fedoraproject.prod.kerneltest.upload.new

The same query without the "user=bretth" returns a message from bretth uploaded in the last 2 hours - https://apps.fedoraproject.org/datagrepper/v2/id?id=2022-effc9ab8-5f44-4373-b5a3-c716181d270e&is_raw=true&size=extra-large

The "user=" query has worked fine for ages but now appears to not find recent messages.
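For anyone wanting to reproduce the comparison, here is a minimal sketch that builds the two search URLs. The `search_url` helper is hypothetical; only the endpoint and the `user`, `delta` and `topic` parameters come from the query above.

```python
from urllib.parse import urlencode

# Endpoint taken from the query in this report.
BASE = "https://apps.fedoraproject.org/datagrepper/v2/search"


def search_url(topic, delta, user=None):
    """Build a datagrepper search URL (hypothetical helper); `user` is optional."""
    params = {}
    if user is not None:
        params["user"] = user
    params["delta"] = delta
    params["topic"] = topic
    return f"{BASE}?{urlencode(params)}"


topic = "org.fedoraproject.prod.kerneltest.upload.new"
with_user = search_url(topic, 1000000, user="bretth")  # misses recent records
without_user = search_url(topic, 1000000)              # finds the recent message
```

The two URLs differ only in the `user` parameter, which points at the user filter rather than the topic or time window.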

I am seeing the same query behaviour on the "org.fedoraproject.prod.fedbadges.badge.award" topic as well

I'm happy to help out. Any thoughts on how to figure this out? It would be helpful to see the records in the datanommer db; is there another way to do that?

By a quirk of fate, I am a Java coder but am Python literate so I can work at source code level if that helps.

Thanks

bretth25 Jan 22 '22 22:01

I just noticed that user searches on the "org.fedoraproject.prod.bodhi.update.comment" topic appear normal, i.e. they find recent records. The userid (bretth) appears in the title line of the records found, whereas the title line of kerneltest records does not contain the userid. I'm unsure whether that is relevant.

bretth25 Jan 23 '22 22:01

Making some progress. I've got a datagrepper VM running, shifted the data source from staging to production, captured some messages and done some snooping in the datanommer database with psql.

It appears the success of the "user=" query is related to there being an entry in the "users" table for that user.

Current working hypothesis: Bodhi messages seem to create an entry in the "users" table and kerneltest messages do not.
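A toy model of that hypothesis, using an in-memory SQLite database. The table and column names here are simplified assumptions for illustration, not the real datanommer schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, assumed layout: messages, users, and a link table between them.
conn.executescript("""
    CREATE TABLE messages (id INTEGER PRIMARY KEY, topic TEXT);
    CREATE TABLE users (name TEXT PRIMARY KEY);
    CREATE TABLE users_messages (username TEXT, msg_id INTEGER);
""")

# Bodhi-style message: a username was extracted, so a user row and a link exist.
conn.execute("INSERT INTO messages VALUES (1, 'bodhi.update.comment')")
conn.execute("INSERT INTO users VALUES ('bretth')")
conn.execute("INSERT INTO users_messages VALUES ('bretth', 1)")

# kerneltest-style message: stored, but no username was extracted.
conn.execute("INSERT INTO messages VALUES (2, 'kerneltest.upload.new')")


def search(user=None):
    """Return message ids, optionally filtered through the user link table."""
    sql = "SELECT m.id FROM messages m"
    args = ()
    if user is not None:
        sql += " JOIN users_messages um ON um.msg_id = m.id WHERE um.username = ?"
        args = (user,)
    sql += " ORDER BY m.id"
    return [row[0] for row in conn.execute(sql, args)]


all_ids = search()            # both messages
user_ids = search("bretth")   # only the message linked to a user row
```

Under these assumptions the unfiltered search sees both messages, while the user filter drops the one that never got a link to a user row, which matches the behaviour seen with kerneltest.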

Still digging...

bretth25 Jan 24 '22 22:01

Indeed, I suspect that the kerneltest messages don't have a schema in fedora-messaging, and as a result datanommer does not know how to extract the username from the messages. It would explain why the user table is not populated.

Writing a schema is not difficult for a Python programmer, and we can add one for kerneltest, but adding the missing schemas has not been very high on our priority list so far. Hopefully this bug report will bump it.
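As a rough idea of what such a schema could look like, here is a sketch. To keep it runnable on its own, `Message` below is a stand-in for `fedora_messaging.message.Message`, and modelling `usernames` as a property is an assumption. The class name `KerneltestUploadV1` and the `agent` body field are also assumptions for illustration.

```python
class Message:
    """Stand-in for the fedora-messaging base class (assumed interface)."""

    topic = ""
    body_schema = {}

    def __init__(self, body=None, topic=None):
        self.body = body or {}
        if topic is not None:
            self.topic = topic

    @property
    def usernames(self):
        # Without a dedicated schema, no usernames can be extracted.
        return []


class KerneltestUploadV1(Message):
    """Hypothetical schema for kerneltest upload messages."""

    topic = "org.fedoraproject.prod.kerneltest.upload.new"
    # Assumed body layout: the uploader's username is stored under "agent".
    body_schema = {
        "type": "object",
        "properties": {"agent": {"type": "string"}},
        "required": ["agent"],
    }

    @property
    def usernames(self):
        return [self.body["agent"]]


generic = Message(body={"agent": "bretth"})
typed = KerneltestUploadV1(body={"agent": "bretth"})
```

With the generic class the username stays invisible, while the dedicated schema exposes it, which is what would let the "users" table be populated.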

Thanks for your investigation.

abompard Jan 25 '22 08:01

Thanks for the reply.

I started looking into adding schemas and realised that testing the solution would require changes to the schema in the kerneltest.upload header (and also to the schema in the badge.award header, since badge award messages have the same problem with the user= query).

I've written a quick fix by adding these lines to the usernames() method in fedora_messaging/message.py:

    if "agent" in self.body:
        return [self.body["agent"]]
    if "user" in self.body and isinstance(self.body["user"], dict):
        return [self.body["user"]["username"]]

With that fix, user= queries now work for kerneltest.upload and badge.award messages.
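The same fallback logic as a standalone function, so it can be tried without patching fedora-messaging. The function name `fallback_usernames` and the sample bodies are made up for illustration. Unlike the quick fix, this version also tolerates a "user" dict with no "username" key.

```python
def fallback_usernames(body):
    """Guess the usernames from a message body (hypothetical helper)."""
    # Case 1: the username is stored directly under "agent".
    if "agent" in body:
        return [body["agent"]]
    # Case 2: the username is nested in a "user" dict.
    user = body.get("user")
    if isinstance(user, dict) and "username" in user:
        return [user["username"]]
    # Nothing recognisable: same result as having no schema at all.
    return []


# Made-up sample bodies covering the two shapes handled above.
agent_style = fallback_usernames({"agent": "bretth"})
nested_style = fallback_usernames({"user": {"username": "bretth"}})
unknown_style = fallback_usernames({"something": "else"})
```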

Just thinking, would changing the base message to cover some simple cases save the effort of writing multiple trivial schemas and changing message headers in multiple messages? Or is that too much of a hack?

bretth25 Jan 26 '22 13:01