grimoirelab-elk
Update Bugzilla CSV to latest format
Following https://github.com/chaoss/grimoirelab-elk/issues/803, the goal of this issue is to update the schema of Bugzilla: https://github.com/chaoss/grimoirelab-elk/blob/master/schema/bugzilla.csv
Micro-mordred[*] should be executed to collect and enrich the Bugzilla data. Then, the enriched documents should be inspected using the Dev Tools or the Discover view of Kibiter. For each attribute found in the enriched index, the schema should list the attribute's name, its type, whether the field can be aggregated, and a description.
Note that some fields, such as grimoire_creation_date, project, project_1, and origin, are shared across all enriched indexes, so their descriptions can be taken from existing schemas (e.g., https://github.com/chaoss/grimoirelab-elk/blob/master/schema/git.csv).
[*] Details to execute micro-mordred for a given data source are available at: https://github.com/chaoss/grimoirelab-sirmordred#supported-data-sources
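The schema-building step described above can be sketched in a few lines of Python. This is only an illustration, not the project's tooling: it assumes you have already copied the `properties` section of the enriched index mapping (for example from the Dev Tools) into a dictionary, and it assumes the CSV columns are name, type, aggregatable, and description. The helper names, the sample field `example_text_field`, and the rule that only `text` fields are non-aggregatable are assumptions made for the example.

```python
import csv
import io

# Assumption: only "text" fields are marked as non-aggregatable, since
# Elasticsearch disables fielddata on them by default. Adjust as needed.
NON_AGGREGATABLE_TYPES = {"text"}

# Assumed column layout for the schema CSV.
COLUMNS = ["name", "type", "aggregatable", "description"]


def mapping_to_rows(properties, descriptions=None, prefix=""):
    """Turn the 'properties' part of an index mapping into schema rows."""
    descriptions = descriptions or {}
    rows = []
    for name in sorted(properties):
        spec = properties[name]
        full_name = prefix + name
        # Object fields carry nested 'properties' instead of a 'type'.
        if "properties" in spec and "type" not in spec:
            rows.extend(
                mapping_to_rows(spec["properties"], descriptions, full_name + ".")
            )
            continue
        es_type = spec.get("type", "object")
        rows.append({
            "name": full_name,
            "type": es_type,
            "aggregatable": "false" if es_type in NON_AGGREGATABLE_TYPES else "true",
            # Descriptions still have to be written (or reused) by hand.
            "description": descriptions.get(full_name, ""),
        })
    return rows


def rows_to_csv(rows):
    """Serialize schema rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Illustrative mapping fragment; a real one comes from the enriched index.
SAMPLE_PROPERTIES = {
    "grimoire_creation_date": {"type": "date"},
    "origin": {"type": "keyword"},
    "example_text_field": {"type": "text"},  # hypothetical field
}

# Placeholder descriptions; shared fields can reuse text from existing schemas.
SAMPLE_DESCRIPTIONS = {
    "origin": "Placeholder description for the origin field.",
}

print(rows_to_csv(mapping_to_rows(SAMPLE_PROPERTIES, SAMPLE_DESCRIPTIONS)))
```

The generated rows are a starting point only: the types come straight from the mapping, while the descriptions (and any aggregatable flags that differ from the assumed rule) still need to be reviewed manually.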
Hi @valeriocos, I think this is a duplicate, since Bugzilla was already mentioned in issue #803. Let me know if there is a specific reason for opening this that I might have missed. Thanks.
EDIT: Anyway, never mind. We can keep it open for discussion in case anyone has doubts. :smile:
Hi @vchrombie, you're right! Bugzilla was listed in #803. However, I thought that issue was too generic. Sorry for the misunderstanding!
If you are on it, we can close this issue (or leave it open if this is convenient for you). As you prefer :)
It is fine, we can leave it like this. Not a big deal. :slightly_smiling_face: Thanks.
Also, @sanjana091001 would you like to help with this issue? I can help you if you have any doubts.
That's perfect! Thanks!
Yes, I'd like to work on this. Thanks @vchrombie :)
Hello, I have finished setting up GrimoireLab and have started working on this issue.
Great, thank you for working on this @sanjana091001
Hi @valeriocos @vchrombie. I created a script yesterday to generate the schema for Pagure. It's based on the same logic as the script from microtask-8, and I have modified it a bit to support any index. Let me know if it helps with this work :)
Here is the link: https://gist.github.com/animeshk08/0a8dafa66826137032efb6c771074d1d
NOTE: The only default field right now is UUID. I was unsure which other ones to add. I can add the remaining fields after a bit of discussion :)
Hi @animeshk08 thank you for sharing the script. It could be useful to speed up the creation of the missing/not up-to-date CSV files. @vchrombie @sanjana091001 WDYT? Can you give it a try for the CSVs you are working on?
Thanks!
Sorry @animeshk08 @valeriocos, I missed the messages. Thanks for the suggestion. I will try the script on Meetup or Stack Overflow and let you know how it goes.
I just want to reference another PR (#847), as it will affect this work if your fork is not up to date.
Hi @animeshk08
I was working on the gitlabcomments enricher and tried this script to create its schema. It worked great. Thanks for sharing the script. :rocket: :smile: