
Instructions for bulk-uploading projects to Sigrid.

Open dennis-sig opened this issue 2 years ago • 1 comments

dennis-sig avatar Jul 04 '22 15:07 dennis-sig

Feedback from Marco:

  • Multiple repos can have the same name across different groups, so we had to work around that by prefixing the system name with the group name.
  • Groups have subgroups, which further complicates system naming and could lead to the same problem with overly long system names that we had earlier (now fixed).
  • Ghorg does not accept a list of orgs (groups) or a list of repos per org as input; it just fetches everything. This is a problem because they wanted a subset of repos for each org (96 orgs).
  • We created a quick-and-dirty script that processes all their repos from a list. I will ask our contact if he can share these scripts with us, so that we have a starting point if we want to do this with other customers.
  • The advantage of writing our own tool is that we can detect system names that are the same across different (sub)groups and handle them with some strategy during onboarding, maybe configurable by the client/consultant.
  • The disadvantage is that it then clearly becomes our burden to maintain, as I hope it’ll be used a million times from now on. We should also consider supporting at least GitHub as another platform.
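
To make the naming points above concrete, here is a minimal Python sketch of prefixing system names with the group path and detecting collisions before onboarding. It is not the script mentioned above: the function names, the `-` separator, the character normalization, and the `MAX_NAME_LENGTH` value are all assumptions chosen for illustration, not documented Sigrid rules.

```python
import re
from collections import defaultdict

# Hypothetical limit, for illustration only; check the real Sigrid constraint.
MAX_NAME_LENGTH = 64


def system_name(repo_path, max_length=MAX_NAME_LENGTH):
    """Turn 'group/subgroup/repo' into a prefixed name like 'group-subgroup-repo'."""
    parts = [
        re.sub(r"[^a-z0-9]+", "-", part.lower()).strip("-")
        for part in repo_path.split("/")
    ]
    name = "-".join(part for part in parts if part)
    # Truncation keeps names short but can itself create collisions,
    # which is why find_collisions() runs on the final names.
    return name[:max_length].rstrip("-")


def find_collisions(repo_paths, max_length=MAX_NAME_LENGTH):
    """Return {system_name: [repo paths]} for names claimed by more than one repo."""
    by_name = defaultdict(list)
    for path in repo_paths:
        by_name[system_name(path, max_length)].append(path)
    return {name: paths for name, paths in by_name.items() if len(paths) > 1}
```

Running `find_collisions` over the full repo list before any upload gives the consultant a list of names that need a manual decision, which is one possible form of the "strategy" mentioned above.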

dennis-sig avatar Aug 22 '22 10:08 dennis-sig

I tackled this goal differently:

  • All in Python
  • Use PyGithub to request all repos
  • Handle all repos sequentially in a for-loop:
    • Check if the code is relevant based on GitHub repo object properties using custom conditions
    • Git checkout into a temporary directory
    • Determine metadata from repo contents and generate a Sigrid-friendly system name
    • Upload by calling the sigrid.py from a subprocess
  • I optimized it to check how current the code in Sigrid is, and to skip the repo if a recent upload already took place
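
The control flow of the steps above can be sketched roughly as follows. This is not the actual script: the `Repo` dataclass stands in for the PyGithub repository object, `is_relevant` shows just one possible custom condition, and `last_upload_of` and `upload` are hypothetical callables that would wrap the Sigrid lookup and the checkout-plus-subprocess upload respectively. The 7-day threshold is likewise an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Iterable, List, Optional


@dataclass
class Repo:
    """Stand-in for a GitHub repo object; only the fields used here."""
    full_name: str
    archived: bool


def is_relevant(repo: Repo) -> bool:
    # Example custom condition (an assumption): skip archived repos.
    return not repo.archived


def process_repos(
    repos: Iterable[Repo],
    last_upload_of: Callable[[Repo], Optional[datetime]],
    upload: Callable[[Repo], None],
    now: datetime,
    max_age: timedelta = timedelta(days=7),
) -> List[str]:
    """Sequentially upload relevant repos that have no recent upload."""
    uploaded = []
    for repo in repos:
        if not is_relevant(repo):
            continue
        last = last_upload_of(repo)
        if last is not None and now - last <= max_age:
            continue  # a recent upload already took place
        upload(repo)  # e.g. check out to a temp dir, then run the upload script
        uploaded.append(repo.full_name)
    return uploaded
```

Passing the lookup and upload steps in as callables keeps the selection logic testable without network access or Sigrid credentials.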

Compared to the approach in this PR, it lets you inject logic to select the relevant code, and it can prevent unnecessary uploads.

The sequential setup makes it a bit slow, but that has the benefit of reducing the load on Sigrid.

nicorikken avatar Jan 26 '23 13:01 nicorikken

Closing this pull request because people are now able to figure this out without the need for special documentation.

dennis-sig avatar Oct 18 '23 12:10 dennis-sig