grimoirelab
studies on git index not working
I have the following setup, based on the example here:
https://github.com/chaoss/grimoirelab/blob/master/default-grimoirelab-settings/setup.cfg
[git]
# Names for raw and enriched indexes
raw_index = git_grimoirelab-raw
enriched_index = git_grimoirelab
latest-items = true
studies = [enrich_demography:git, enrich_areas_of_code:git, enrich_onion:git]
[enrich_demography:git]
[enrich_areas_of_code:git]
in_index = git_grimoirelab-raw
out_index = git_aoc_grimoirelab_enriched
[enrich_onion:git]
in_index = git_grimoirelab
out_index = git_onion_grimoirelab_enriched
contribs_field = hash
Unfortunately, I don't see the out_index being created for either the onion or the areas of code study. The demography data is also not visible in the Kibiter dashboards.
I'm running the following docker images.
bitergia/mordred:grimoirelab-0.5.52
bitergia/kibiter:community-v6.8.6-3
docker.elastic.co/elasticsearch/elasticsearch-oss:6.8.6
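A quick way to sanity-check a setup like the one in this comment is to confirm that every study named in the studies list has its own section, and to see which indexes each one reads and writes. The following is a minimal sketch using only Python's standard configparser. The helper names parse_studies and study_report are hypothetical, and this is not how SirMordred itself loads the file; it only mirrors the structure shown above.

```python
from configparser import ConfigParser

# The configuration fragment from this comment, embedded for the example.
SETUP_CFG = """\
[git]
raw_index = git_grimoirelab-raw
enriched_index = git_grimoirelab
latest-items = true
studies = [enrich_demography:git, enrich_areas_of_code:git, enrich_onion:git]

[enrich_demography:git]

[enrich_areas_of_code:git]
in_index = git_grimoirelab-raw
out_index = git_aoc_grimoirelab_enriched

[enrich_onion:git]
in_index = git_grimoirelab
out_index = git_onion_grimoirelab_enriched
contribs_field = hash
"""


def parse_studies(raw_value):
    """Turn '[a, b, c]' into ['a', 'b', 'c'] (hypothetical helper)."""
    inner = raw_value.strip().strip("[]")
    return [name.strip() for name in inner.split(",") if name.strip()]


def study_report(cfg_text, backend="git"):
    """Map each study listed for `backend` to its options, or None if its section is missing."""
    parser = ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    report = {}
    for study in parse_studies(parser.get(backend, "studies", fallback="")):
        report[study] = dict(parser[study]) if parser.has_section(study) else None
    return report


if __name__ == "__main__":
    for study, options in study_report(SETUP_CFG).items():
        print(study, "->", options if options is not None else "MISSING SECTION")
```

A study that maps to None would be listed in studies without a matching section. In the configuration above all three sections are present, so a missing section is not what prevents the study indexes from appearing.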
Hi @marcofranssen, thanks for showing your interest in GrimoireLab. I see a small mistake in the configs. I think it might be one of the reasons for the issue.
[enrich_onion:git]
- in_index = git_grimoirelab
+ in_index = git
out_index = git_onion_grimoirelab_enriched
contribs_field = hash
Let us know if this solves the issue. :slightly_smiling_face:
@vchrombie I don't get it. In the examples shown on the link that I posted in my previous comment, the in_index of [enrich_onion:git] points to the out_index of the [git] section.
I'll try to change it and see if that at least populates the onion part. But I don't see the logic in the examples on how to resolve this for the enrich areas of code section.
Hi @marcofranssen
@vchrombie I don't get it. In the examples shown on the link that I posted in my previous comment, the in_index of [enrich_onion:git] points to the out_index of the [git] section.
Sorry, but I checked it too. Link to the line: https://github.com/chaoss/grimoirelab/blob/master/default-grimoirelab-settings/setup.cfg#L152
The in_index of [enrich_onion:git] points to git. This index name comes from the aliases.json file.
I'll try to change it and see if that at least populates the onion part. But I don't see the logic in the examples on how to resolve this for the enrich areas of code section.
The areas of code configuration looks fine to me. I don't understand the reason why the out_index is not generated. I will have a closer look.
Thanks.
@vchrombie yup I also just realized, git is the alias for my index git_grimoirelab. Both studies simply don't seem to be executed as there are no indexes created.
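To confirm which concrete index sits behind an alias such as git, one can query Elasticsearch's alias endpoint (GET /_alias/git) and read the index names from the response. The sketch below only handles the response body, which maps each index name to its aliases. The function name indices_behind_alias is hypothetical, and the sample response is hand-written to match the situation described in this thread rather than captured from a real cluster.

```python
def indices_behind_alias(alias_response, alias):
    """Return the sorted index names that carry `alias`.

    `alias_response` is the parsed JSON body of GET /_alias/<alias>,
    shaped like {"<index>": {"aliases": {"<alias>": {...}}}}.
    """
    return sorted(
        index
        for index, body in alias_response.items()
        if alias in body.get("aliases", {})
    )


if __name__ == "__main__":
    # Hand-written sample (assumption): the alias `git` points at git_grimoirelab.
    sample = {"git_grimoirelab": {"aliases": {"git": {}}}}
    print(indices_behind_alias(sample, "git"))
```

If the alias resolves to the enriched index, then in_index = git and in_index = git_grimoirelab refer to the same data.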
I am having a similar issue. Trying to construct a setup.cfg from the example is extremely difficult, and the documentation snippets in https://github.com/chaoss/grimoirelab-sirmordred#setupcfg do not generate a working configuration either. I have declared github:pull, git, and github:repo as described in the extended documentation for setup.cfg, and all my charts are empty. Even the data sources appear empty, but ES ran and I watched the debug logs iterate PRs, etc.
Hi @marcofranssen,
I assume your git raw and enriched indexes were generated correctly.
Could you check your all.log to see whether the studies have been run and whether there are any errors? Keep in mind that the studies only start once the enrichment process has finished.
I hope it helps you.
Best, Quan
Hi @RBI-AaronKulick,
Trying to construct a setup.cfg from the example is extremely difficult, and the documentation snippets in https://github.com/chaoss/grimoirelab-sirmordred#setupcfg do not generate a working configuration either.
Sorry, we will improve the documentation.
I have declared github:pull, git, and github:repo as described in the extended documentation for setup.cfg, and all my charts are empty. Even the data sources appear empty, but ES ran and I watched the debug logs iterate PRs, etc.
Please check your all.log for errors. Could you also share your setup.cfg and projects.json?
Best, Quan
@zhquan I noticed that mordred had been stuck in the following loop for as far back as I could scroll in the logs.
2021-01-26 10:04:27,722 - sirmordred.task_manager - INFO - [Global tasks] sleeping for 100 seconds
2021-01-26 10:06:08,810 - sirmordred.task_projects - INFO - Reading projects data from /home/bitergia/conf/projects.json
2021-01-26 10:06:09,831 - sirmordred.task_identities - INFO - [sortinghat] No changes in file /home/bitergia/conf/organizations.json, organizations won't be loaded
2021-01-26 10:06:09,831 - sirmordred.task_identities - INFO - Loading GrimoireLab identities in SortingHat
2021-01-26 10:06:10,047 - sirmordred.task_identities - INFO - [sortinghat] No changes in file /tmp/tmpu2r64qpf, identities won't be loaded
2021-01-26 10:06:10,048 - sirmordred.task_identities - INFO - [sortinghat] End of loading identities from file /tmp/tmpu2r64qpf
2021-01-26 10:06:28,367 - sirmordred.task_identities - INFO - [sortinghat] Unifying identities using algorithm email-name
2021-01-26 10:06:29,432 - sirmordred.task_identities - INFO - [sortinghat] Unifying identities using algorithm email
2021-01-26 10:06:30,483 - sirmordred.task_identities - INFO - [sortinghat] Unifying identities using algorithm github
2021-01-26 10:06:31,359 - sirmordred.task_identities - INFO - [sortinghat] Executing affiliate
2021-01-26 10:06:45,168 - sirmordred.task_identities - INFO - [sortinghat] Executing autoprofile for sources: ['git', 'github']
2021-01-26 10:06:57,050 - sirmordred.task_identities - INFO - [sortinghat] Autogender not configured. Skipping.
2021-01-26 10:06:57,050 - sirmordred.task_manager - INFO - [Global tasks] sleeping for 100 seconds
2021-01-26 10:08:38,132 - sirmordred.task_projects - INFO - Reading projects data from /home/bitergia/conf/projects.json
2021-01-26 10:08:39,149 - sirmordred.task_identities - INFO - [sortinghat] No changes in file /home/bitergia/conf/organizations.json, organizations won't be loaded
2021-01-26 10:08:39,149 - sirmordred.task_identities - INFO - Loading GrimoireLab identities in SortingHat
2021-01-26 10:08:39,363 - sirmordred.task_identities - INFO - [sortinghat] No changes in file /tmp/tmp2fi540uh, identities won't be loaded
2021-01-26 10:08:39,363 - sirmordred.task_identities - INFO - [sortinghat] End of loading identities from file /tmp/tmp2fi540uh
2021-01-26 10:08:57,782 - sirmordred.task_identities - INFO - [sortinghat] Unifying identities using algorithm email-name
2021-01-26 10:08:58,857 - sirmordred.task_identities - INFO - [sortinghat] Unifying identities using algorithm email
2021-01-26 10:08:59,908 - sirmordred.task_identities - INFO - [sortinghat] Unifying identities using algorithm github
2021-01-26 10:09:00,808 - sirmordred.task_identities - INFO - [sortinghat] Executing affiliate
2021-01-26 10:09:14,909 - sirmordred.task_identities - INFO - [sortinghat] Executing autoprofile for sources: ['git', 'github']
2021-01-26 10:09:26,468 - sirmordred.task_identities - INFO - [sortinghat] Autogender not configured. Skipping.
2021-01-26 10:09:26,468 - sirmordred.task_manager - INFO - [Global tasks] sleeping for 100 seconds
2021-01-26 10:11:07,568 - sirmordred.task_projects - INFO - Reading projects data from /home/bitergia/conf/projects.json
2021-01-26 10:11:08,585 - sirmordred.task_identities - INFO - [sortinghat] No changes in file /home/bitergia/conf/organizations.json, organizations won't be loaded
2021-01-26 10:11:08,585 - sirmordred.task_identities - INFO - Loading GrimoireLab identities in SortingHat
2021-01-26 10:11:08,804 - sirmordred.task_identities - INFO - [sortinghat] No changes in file /tmp/tmpu78jd8n6, identities won't be loaded
2021-01-26 10:11:08,804 - sirmordred.task_identities - INFO - [sortinghat] End of loading identities from file /tmp/tmpu78jd8n6
2021-01-26 10:11:27,295 - sirmordred.task_identities - INFO - [sortinghat] Unifying identities using algorithm email-name
2021-01-26 10:11:28,358 - sirmordred.task_identities - INFO - [sortinghat] Unifying identities using algorithm email
2021-01-26 10:11:29,406 - sirmordred.task_identities - INFO - [sortinghat] Unifying identities using algorithm github
2021-01-26 10:11:30,282 - sirmordred.task_identities - INFO - [sortinghat] Executing affiliate
2021-01-26 10:11:44,080 - sirmordred.task_identities - INFO - [sortinghat] Executing autoprofile for sources: ['git', 'github']
2021-01-26 10:11:54,499 - sirmordred.task_identities - INFO - [sortinghat] Autogender not configured. Skipping.
After I restarted the mordred process, it started collecting data again.
2021-01-26 10:28:03,918 - grimoire_elk.elk - INFO - [git] Done collection for https://github.com/my-org/my-repo.git
2021-01-26 10:28:03,919 - sirmordred.task_collection - INFO - [git] collection finished for https://github.com/my-org/my-repo.git
2021-01-26 10:28:03,919 - sirmordred.task_collection - INFO - [git] collection starts for https://github.com/my-org/my-repo.git
2021-01-26 10:28:03,941 - grimoire_elk.raw.elastic - INFO - [git] Incremental from: 2020-09-29 20:13:53+00:00 for https://github.com/my-org/my-repo.git
Now waiting for this part of the process to complete. What should I grep for in my logs to find the studies?
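One way to search all.log for study activity is to split each line into its fields and keep the ones whose message mentions a study. This is a rough sketch: it assumes the "timestamp - logger - LEVEL - message" layout seen in the log excerpts in this thread, and it assumes study-related messages contain the word "study" or "studies", as the study error reported later in this thread does. The names parse_line and find_study_lines are hypothetical.

```python
def parse_line(line):
    """Split 'timestamp - logger - LEVEL - message'; return None for other lines (e.g. tracebacks)."""
    parts = line.rstrip("\n").split(" - ", 3)
    if len(parts) != 4:
        return None
    timestamp, logger, level, message = parts
    return {"timestamp": timestamp, "logger": logger, "level": level, "message": message}


def find_study_lines(lines, keywords=("study", "studies")):
    """Return parsed records whose message mentions any keyword (case-insensitive)."""
    hits = []
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue
        message = record["message"].lower()
        if any(keyword in message for keyword in keywords):
            hits.append(record)
    return hits


if __name__ == "__main__":
    # Two lines taken from this thread; the second is shortened with "(...)".
    sample = [
        "2021-01-26 10:04:27,722 - sirmordred.task_manager - INFO - [Global tasks] sleeping for 100 seconds",
        "2021-01-26 11:11:05,769 - grimoire_elk.elk - ERROR - [git] Problem executing study enrich_areas_of_code:git, ConnectionError(...)",
    ]
    for record in find_study_lines(sample):
        print(record["level"], record["message"])
```

To scan a real file, pass an open file object, for example find_study_lines(open("all.log")), adjusting the path to wherever the log lives in your deployment.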
@zhquan I found the following logs.
2021-01-26 11:11:05,769 - grimoire_elk.elk - ERROR - [git] Problem executing study enrich_areas_of_code:git, ConnectionError(HTTPConnectionPool(host='elasticsearch', port=9200): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f86e550fc18>: Failed to establish a new connection: [Errno 111] Connection refused'))) caused by: ConnectionError(HTTPConnectionPool(host='elasticsearch', port=9200): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f86e550fc18>: Failed to establish a new connection: [Errno 111] Connection refused')))
2021-01-26 11:11:05,769 - sirmordred.task_manager - ERROR - [git] Exception in Task Manager ConnectionError(HTTPConnectionPool(host='elasticsearch', port=9200): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f86e550fc18>: Failed to establish a new connection: [Errno 111] Connection refused'))) caused by: ConnectionError(HTTPConnectionPool(host='elasticsearch', port=9200): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f86e550fc18>: Failed to establish a new connection: [Errno 111] Connection refused')))
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/urllib3/connection.py", line 159, in _new_conn
(self._dns_host, self.port), self.timeout, **extra_kw)
File "/usr/local/lib/python3.7/dist-packages/urllib3/util/connection.py", line 80, in create_connection
raise err
File "/usr/local/lib/python3.7/dist-packages/urllib3/util/connection.py", line 70, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
It seems the port is hardcoded in the studies, whereas my Elasticsearch endpoint runs on port 80 at http://elasticsearch via a proxy.
In my credentials.cfg I have the endpoints defined as follows.
[es_collection]
url = https://admin:admin@elasticsearch
[es_enrichment]
url = https://admin:admin@elasticsearch
I would expect the studies to use this endpoint as is, rather than adding a port. I would like to stick with my Traefik load-balanced setup to spread the load across my Elasticsearch nodes. A workaround would be to send everything directly to a single node, but that is not a production-like setup.
Hi @marcofranssen
Sorry for the late reply. I lost this issue :(
The port is not hardcoded in the code; by default, Elasticsearch uses port 9200: https://github.com/chaoss/grimoirelab-elk/blob/master/grimoire_elk/enriched/git.py#L534.
Do you have enriched indexes? If the enriched indexes can be created, creating the study indexes should work as well: https://github.com/chaoss/grimoirelab-sirmordred/blob/master/sirmordred/task_enrich.py#L282
Could you try to set the port directly on the URL as https://admin:admin@elasticsearch:80 and try again?
Best, Quan
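The suggestion above amounts to making the port explicit in the URL so that no client-side default (such as 9200) can be applied. As a small illustration using only Python's standard urllib.parse, the sketch below shows that the URL from credentials.cfg carries no port, and adds one when it is missing. The helper name with_explicit_port is hypothetical, and the snippet makes no claim about how GrimoireLab itself parses the URL.

```python
from urllib.parse import urlsplit, urlunsplit


def with_explicit_port(url, port):
    """Return `url` unchanged if it already names a port, otherwise append `port` to the host."""
    parts = urlsplit(url)
    if parts.port is not None:
        return url
    # netloc keeps any 'user:password@' prefix, so only the host part gains the port.
    return urlunsplit(parts._replace(netloc=f"{parts.netloc}:{port}"))


if __name__ == "__main__":
    original = "https://admin:admin@elasticsearch"
    print(urlsplit(original).port)           # no port in the URL as configured
    print(with_explicit_port(original, 80))  # URL with the port spelled out
```

The resulting value can then be used for the url entries in the [es_collection] and [es_enrichment] sections.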
Closing this due to no activity.