nbgitpuller
Support pulling from a private repo
nbgitpuller is presumably intended for use with public repositories, but is there also a way of pulling files down from a private repo?
Presumably this would require a key adding to the repo URL, ideally set from a read-only account on the repo?
You could add the token to the URL and edit how the git URL is constructed so that it is taken into account. This would only work if you trust your users, because they'd be able to see the token.
Another way is to run a git proxy as part of your deployment. That proxy knows the secret and periodically fetches the upstream repository. Users pull from the proxy without needing authentication.
https://github.com/berkeley-dsep-infra/data8xhub/tree/b234e2da2838c40597133f7ae60546d361477d51/hub/templates/reposync is an example of setting up a git proxy. There the use case is allowing users to update material without having to allow them to access the internet. You could probably repurpose this for private repo access.
Thanks; the token baked into the URL route is the one I think we'd go for, because the end users will be trusted insofar as you can ever trust students! Just means we have to make sure we set permissions correctly on the other end.
(There's nothing we really need to keep secret in the repo, it's just that the policy is currently to use a private one :-( )
Hi I am having the same issue that was solved above. However I don't understand how to implement the same solution. Specifically I don't understand how to incorporate the git token into the puller URL. Can you provide some guidance on the structure of the URL and what type of token to use?
@psychemedia or @betatim do you think you could describe in a bit more detail how to do this? We should add a section to the docs as I think this is probably a fairly common pattern
https://blog.github.com/2012-09-21-easier-builds-and-deployments-using-git-over-https-and-oauth/ is probably a good place to start to learn about access tokens and incorporating them into nbgitpuller URLs.
You end up with a URL like https://hubme.com/hub/user-redirect/git-pull?repo=http%3A%2F%2FSECRETTOKENHERE%3Ax-auth-token%40github.com%2Ftim%2Ftim.git&branch=master&subPath=examples&app=notebook
The problem with this approach is that your students can see the token, so you can only use this if you trust them or if you use the nbgitpuller executable in a lifecycle hook when your deployment is like the zero2jupyterhub one.
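A minimal sketch of how a link like the one above could be assembled with Python's standard library. The helper name `build_pull_link` is hypothetical (it is not part of nbgitpuller), and the hub, token, and repo values are just the placeholders from the example URL. The point it illustrates is that the whole repo URL, credentials included, has to be percent-encoded before it goes into the `repo` query parameter.

```python
from urllib.parse import quote, urlencode


def build_pull_link(hub_url, repo_url, branch, sub_path, app="notebook"):
    """Assemble a git-pull link (hypothetical helper, not part of nbgitpuller)."""
    params = {"repo": repo_url, "branch": branch, "subPath": sub_path, "app": app}
    # quote_via=quote with safe="" percent-encodes ":", "/" and "@" in the repo URL.
    query = urlencode(params, quote_via=quote, safe="")
    return f"{hub_url.rstrip('/')}/hub/user-redirect/git-pull?{query}"


link = build_pull_link(
    "https://hubme.com",
    "http://SECRETTOKENHERE:x-auth-token@github.com/tim/tim.git",
    branch="master",
    sub_path="examples",
)
print(link)
```

Percent-encoding is not encryption: anyone who receives the link can decode the token straight back out of it.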
@betatim Thanks for the tips. To expand a bit on what I had to do to get the nbgitpuller link to work:
- Create a Personal Access Token following these directions and copy the key to some place safe.
- Use the nbgitpuller constructor to help build the URL.
- In the Repo_url field, add your GitHubUserName : SecretToken @ between the http:// and your git branch URL (no spaces; I just added those for emphasis). Don't forget the .git at the end of the URL path.
Re-opening, since I think we should support private repositories in ways that don't require folks to share their personal access token (equivalent to a password) with others.
I've opened https://github.com/jupyterhub/nbgitpuller/issues/85 to block supporting URLs with personal access tokens, since that is extremely dangerous. However, we should do that only after providing a way to access private repositories easily.
I did some experiments to try to work around this issue. What I did was:
- configure git to cache credentials inside the terminal and do a fetch or clone to initialize the cache:
git config credential.helper 'cache --timeout=120'
git config --global credential.https://mysite.ext.username eric.leblouch
git clone https://mysite.ext/myrepo
- Then I could click on the nbgitpuller link and have it clone my private repo without needing to put any username or password in the link.
To correct this issue, what do you think would be sufficient ?
- HTTPS access with the possibility to enter username and password if needed
- SSH configuration in the terminal and then direct access (with the possibility to enter the passphrase if needed)
Seems complex to implement a callback here to ask for Username first and then for Password in order to pass them to the living git subprocess.
What I get from clone or fetch is:
- Enter passphrase for key 'my_key':
- Username for 'https://mysite.ext':
- Password for 'https://mysite.ext':
They all end with a colon and need an input.
It seems to me that adding an 'input' phase, and passing the line ending with the colon as its message, could be a way for the server to ask index.js for the input. But without a web socket or additional calls to the API, I do not understand how to send the input back to the server.
Have I missed something here ?
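A rough sketch of the detection half of that idea, assuming the three prompts look exactly as quoted above. The function name and the shape of the returned message (`phase`, `message`, `secret`) are hypothetical and not part of nbgitpuller's actual protocol. It leaves open the question of how the answer would get back to the git subprocess.

```python
import re

# Patterns for the three prompts quoted above; other git versions or
# locales may word them differently (assumption).
PROMPT_PATTERNS = [
    re.compile(r"^Enter passphrase for key '.+':\s*$"),
    re.compile(r"^Username for '.+':\s*$"),
    re.compile(r"^Password for '.+':\s*$"),
]


def input_phase_message(line):
    """Return a hypothetical 'input' phase message for a prompt line, else None."""
    text = line.strip()
    if any(pattern.match(text) for pattern in PROMPT_PATTERNS):
        # Passphrases and passwords should not be echoed in the browser.
        secret = not text.startswith("Username")
        return {"phase": "input", "message": text, "secret": secret}
    return None
```

Ordinary progress lines such as `Cloning into 'myrepo'...` return `None`, so they could keep flowing through whatever phase already reports git output.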
This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:
https://discourse.jupyter.org/t/jupyterhub-not-able-to-connect-to-internal-repositories/7487/1
I did not find any issue raised for this problem. If you see one, could you send me the link?
I've got it working with github.com/yuvipanda/git-credential-helpers. There's some helpful config here, here. https://github.com/utoronto-2i2c/jupyterhub-deploy#pulling-from-private-github-repos-with-nbgitpuller describes the final workflow.
Would love for someone to write this up! I'll try and find some time in the next month for it...
@yuvipanda just wondering what the status is on this
I would be happy to help with this write-up but I don't understand the GitHub App aspect. Is the source of this available somewhere?
We use private GitHub repositories for assessments (using nbgrader) so our overall use case is more complicated than this, but we might be able to help move this in the right direction.
@hoffm386, I just came across this discussion. Thanks to @yuvipanda's "git-credential-helpers" repo and a new GitHub app I created that has no source code, I was able to pull from a private repo just fine. Really, nothing special with the GitHub app, just fill in the required info and allow read access to code, that's it.
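For anyone unsure what a credential helper does, here is a minimal sketch of the stdin/stdout exchange git has with one. It is not the code from git-credential-helpers. The environment variable `READONLY_GIT_TOKEN` and both function names are hypothetical, and a real setup would obtain a short-lived token from the GitHub app rather than read a static one.

```python
import os
import sys


def parse_request(text):
    """Parse the key=value lines git writes to a helper's stdin."""
    fields = {}
    for line in text.splitlines():
        if not line:
            break  # a blank line ends the request
        key, _, value = line.partition("=")
        fields[key] = value
    return fields


def build_response(request, token, host="github.com"):
    """Return credential lines for HTTPS requests to `host`, else ''."""
    if request.get("protocol") != "https" or request.get("host") != host:
        return ""
    # "x-access-token" is the username GitHub expects with app installation tokens.
    return f"username=x-access-token\npassword={token}\n"


if __name__ == "__main__":
    # git calls the helper as: <helper> get|store|erase; only "get" needs an answer.
    if len(sys.argv) > 1 and sys.argv[1] == "get":
        token = os.environ.get("READONLY_GIT_TOKEN", "")  # hypothetical variable
        if token:
            sys.stdout.write(build_response(parse_request(sys.stdin.read()), token))
```

Because git asks the helper only when it needs credentials, the token never has to appear in the nbgitpuller link that users see.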
Is the DIY https://github.com/yuvipanda/git-credential-helpers approach still the best way to support pulling from private repos, or have things moved on in best practice one-way distribution of private repo files where you don't want the user to have any write permissions?
For reference, we store our repos in GCP (so they are all private) and to access them we slipped an unofficial Google credential helper into our singleuser image using:
go install -v github.com/google/googlesource-auth-tools/git-credential-googlesource@latest
strip $(go env GOPATH)/bin/git-credential-googlesource
N.B. you can instead use gcloud (the usual GCP CLI tools), but it adds ~1GB of extra stuff to the image, whereas this Go binary is only ~10MB.
We then configured git to use it with:
git config --system init.defaultBranch main
git config --system google.account application-default
git config --system credential.'https://source.developers.google.com'.helper '!git-credential-googlesource'
Then it is just a case of arranging for the instance credentials to be able to access your GCP Source Repositories (using the role roles/source.reader).