grimoirelab-elk
[Git] `origin` is changed in the `_fix_item` method
https://github.com/chaoss/grimoirelab-elk/blob/efd60e38a100d23979f068ff3ab8131fd88a81f6/grimoire_elk/raw/git.py#L66
We can see that `origin` is changed in the `_fix_item` method.
For example, if the original value of `origin` is `https://xxx:<password>@xxx.com`, it is changed to `https://xxx.com`.
However, the uuid is generated from the original value of `origin`:
perceval/backend.py#L424
'uuid': uuid(self.origin, self.metadata_id(item)),
If we later re-run perceval to fetch all commits (from-date = 1970-01-01) of the same repo, but with a different URL such as `https://xxx2:<password>@xxx.com`, ES ends up with two docs for the same commit, because the uuids differ.
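The following is a minimal sketch of the duplication, not the actual perceval or grimoirelab-elk code. It assumes the uuid is a SHA-1 over the `:`-joined arguments; `sketch_uuid`, `strip_credentials`, the commit id, and the passwords are all hypothetical stand-ins.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit


def sketch_uuid(*args):
    """Hypothetical stand-in for perceval's uuid: SHA-1 of the ':'-joined args."""
    joined = ":".join(args)
    return hashlib.sha1(joined.encode("utf-8", errors="surrogateescape")).hexdigest()


def strip_credentials(url):
    """Hypothetical stand-in for what _fix_item does to origin: drop user:password."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.port:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


commit_id = "abc123"  # placeholder commit hash
first_origin = "https://xxx:secret1@xxx.com"    # placeholder credentials
second_origin = "https://xxx2:secret2@xxx.com"  # same repo, other credentials

# The uuid is computed from the raw origin, credentials included.
uuid_first = sketch_uuid(first_origin, commit_id)
uuid_second = sketch_uuid(second_origin, commit_id)

# After stripping credentials both runs report the same origin,
# yet the two uuids differ, so ES stores two docs for one commit.
print(strip_credentials(first_origin) == strip_credentials(second_origin))
print(uuid_first == uuid_second)
```

Under these assumptions the stored `origin` is identical for both runs while the document ids are not, which is the mismatch described above.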
What I want is a single doc in ES for each commit of a given repo (identified at least by the value of `origin` after `_fix_item`).
I think we should try not to change the value of `origin` in `_fix_item`. But if we must do that, we need to change the rule for generating the uuid.
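One possible shape for the second option, as a sketch only: derive the uuid from the credential-free origin, so that the id and the stored `origin` agree. `anonymize` and `stable_uuid` are hypothetical names, and the SHA-1 over `origin:id` is an assumption about the existing rule, not the real implementation.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit


def anonymize(url):
    """Hypothetical helper: remove user:password from a URL, keep the rest."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.port:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


def stable_uuid(origin, item_id):
    """Hypothetical uuid rule: hash the anonymized origin instead of the raw one."""
    clean_origin = anonymize(origin)
    payload = f"{clean_origin}:{item_id}"
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


# Same commit fetched with two different sets of placeholder credentials.
a = stable_uuid("https://xxx:secret1@xxx.com", "abc123")
b = stable_uuid("https://xxx2:secret2@xxx.com", "abc123")
print(a == b)  # both runs map to one document id
```

A rule like this would give different ids than before for any repo fetched with credentials in the URL, so documents already indexed under the old ids would need to be reindexed or cleaned up.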