
Jobs should be recovered from mongodb

Open ppg opened this issue 9 years ago • 1 comments

I think there are a couple of things at play here, but basically if I lose my Jenkins master, bring up a new one, and connect DotCi to the MongoDB where it was storing all its info, I think that database has enough info to automatically recreate the jobs. I believe this because currently if you try to add back a job for a private repo that is in the DB, you get an exception complaining about that job already existing. At that point, it has reconfigured the org and you can see all your jobs again. However, the one that you just added now has multiple deploy keys for the private repo, which causes it to be marked as not private, and thus the job can't pull any code.

Repro steps:

  1. Setup docker build:

    docker-compose build
    docker-compose run --rm plugin npm run build
    docker-compose up plugin

  2. Visit http://localhost:8080/jenkins and configure DotCi with a test GitHub application

  3. Add a job for a private repository (don't really care if it works or not, but a green one would be better)

  4. Stop your docker-compose up plugin command

  5. Remove the ./work folder, which should clear out all the saved state for Jenkins

  6. Start up jenkins again with docker-compose up plugin

  7. Notice that there are no jobs present; if you want to see the DB state you can docker-compose run --rm mongo mongo --host mongo and then

    use dotci
    db.deploy_keys.find({}, {repo_url: 1})
    

    That will show you there is an existing key for the job

  8. Try to add the same job from step 3

  9. Notice that you get an exception, which is bad; should detect this and handle gracefully.

  10. Go back to root and that repo's organization should be there now and you can navigate to the job; i.e. it was added despite the exception.

  11. Go to the job and try a build; it should fail with a public key denied reason.

  12. Try step 7 again; you'll notice there are now two keys. If you can figure out which one is right you can delete the other one (or modify the repo_url so it's not found) and then the build will work again.
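The duplicate-key condition from step 12 can be spotted mechanically. Below is a minimal sketch, assuming the `repo_url` values returned by the `db.deploy_keys.find({}, {repo_url: 1})` query have already been read into a list; the class and method names are hypothetical and not part of DotCi.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of DotCi: flags repositories that have
// more than one deploy key document stored for the same repo_url.
final class DuplicateDeployKeys {

    /** Returns each repo URL that occurs more than once, mapped to its count. */
    static Map<String, Long> findDuplicates(List<String> repoUrls) {
        // Count occurrences per URL, preserving first-seen order.
        Map<String, Long> counts = repoUrls.stream()
                .collect(Collectors.groupingBy(
                        url -> url, LinkedHashMap::new, Collectors.counting()));
        // Keep only the URLs that have two or more keys.
        counts.values().removeIf(count -> count < 2);
        return counts;
    }

    public static void main(String[] args) {
        // Example URLs are made up for illustration.
        List<String> repoUrls = Arrays.asList(
                "https://github.com/example/private-repo",
                "https://github.com/example/other-repo",
                "https://github.com/example/private-repo");
        findDuplicates(repoUrls).forEach((url, count) ->
                System.out.println(url + " has " + count + " deploy keys"));
    }
}
```

This only reports which repos are affected; deciding which of the duplicate keys is the right one still requires comparing them against what GitHub has registered.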

I think this illustrates a few issues:

  1. On step 6/7 DotCi should be able to tell that it has jobs configured and automatically make all those jobs be present again; that would actually obviate all the issues in the following steps, although I think there's a benefit to making them robust with the rest of these items.
  2. On step 8/9 this scenario should be detected and handled gracefully; i.e. look for the job earlier (hopefully before adding a second deploy key) and if it exists do the work to recreate the file structure to see the jobs and then redirect them to the job.
  3. On step 11, this code is kinda the issue: src/main/java/com/groupon/jenkins/github/services/GithubDeployKeyRepository.java, although I also think this code, src/main/java/com/groupon/jenkins/github/services/GithubRepositoryService.java, is either not working or not called correctly too. Either way, before adding deploy keys we have a handle to the GitHub API, so why not go through the existing keys and see if any work first; if they don't then delete them, if they do then use that one. Basically this area can be more robust.
  4. This is more of a suggestion and it's really specific to debugging this kind of thing, but I think since the keys are encrypted in the DB it'd be nice to calculate the fingerprint and store that so that one could correlate them to what's in GitHub easily.
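For item 4, here is a minimal sketch of computing a fingerprint, assuming the public half of the deploy key is available in the usual one-line OpenSSH form (`<type> <base64> [comment]`). The class and method names are hypothetical, not part of DotCi. It produces both the colon-separated MD5 form and the `SHA256:` form, since which one GitHub displays may depend on the version you are looking at.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper, not part of DotCi: derives SSH-style fingerprints
// from a one-line OpenSSH public key so they can be stored next to the
// encrypted key and compared with what the GitHub UI shows.
final class DeployKeyFingerprint {

    /** Colon-separated hex MD5 of the decoded key blob. */
    static String md5(String publicKeyLine) {
        byte[] digest = digest("MD5", keyBlob(publicKeyLine));
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < digest.length; i++) {
            if (i > 0) sb.append(':');
            sb.append(String.format("%02x", digest[i] & 0xff));
        }
        return sb.toString();
    }

    /** "SHA256:" followed by the unpadded base64 SHA-256 of the key blob. */
    static String sha256(String publicKeyLine) {
        byte[] digest = digest("SHA-256", keyBlob(publicKeyLine));
        return "SHA256:" + Base64.getEncoder().withoutPadding().encodeToString(digest);
    }

    // The fingerprint is taken over the base64-decoded second field only;
    // the key type prefix and trailing comment are not hashed.
    private static byte[] keyBlob(String publicKeyLine) {
        String[] parts = publicKeyLine.trim().split("\\s+");
        if (parts.length < 2) {
            throw new IllegalArgumentException("expected '<type> <base64> [comment]'");
        }
        return Base64.getDecoder().decode(parts[1]);
    }

    private static byte[] digest(String algorithm, byte[] data) {
        try {
            return MessageDigest.getInstance(algorithm).digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Storing either string alongside the `repo_url` in the deploy key document would let you match a DB entry to a key listed in the repository's settings without decrypting anything.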

ppg avatar Aug 06 '15 03:08 ppg

hi @ppg, Let me research this over the weekend.

suryagaddipati avatar Aug 06 '15 20:08 suryagaddipati