dependabot-core
Private hex repo support
This PR addresses https://github.com/dependabot/dependabot-core/issues/1286
Underneath, this required pretty minimal changes to the existing hex updater/checker code. Figuring out how to make the modifications and recognizing that all of the other tests made actual HTTP calls took most of the time.
Currently, it automatically fetches the public_key and doesn't perform any verification against a fingerprint. I opted out of adding public_key support initially because most repo consumers don't have direct access to the public_key anyhow, and they'd have to fetch it the same way. It would be trivial to accept a public_key option or a fingerprint for verification.
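For illustration, a fingerprint check of the kind mentioned above could be sketched in Ruby roughly as follows. This is a minimal sketch, not the code in this PR: the helper names are hypothetical, and it assumes the fingerprint is a hex-encoded SHA-256 digest of the DER-encoded RSA public key, which may not match the format Hex itself uses.

```ruby
require "openssl"
require "digest"

# Hypothetical helper (not part of dependabot-core): fingerprint a PEM-encoded
# RSA public key as the hex SHA-256 digest of its DER encoding.
# Assumption: Hex's own fingerprint format may differ from this one.
def public_key_fingerprint(pem)
  key = OpenSSL::PKey::RSA.new(pem)
  Digest::SHA256.hexdigest(key.public_key.to_der)
end

# Returns true only when the fetched key matches the expected fingerprint.
def trusted_public_key?(pem, expected_fingerprint)
  public_key_fingerprint(pem) == expected_fingerprint.to_s.downcase
end
```

A caller would fetch the key from the repo, then refuse to continue unless trusted_public_key? returns true for the fingerprint supplied in configuration.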
A minimal server that has two versions of the jason package is now hosted on a free fly.io instance at dependabot-private.fly.dev. The hex server's source is here: https://github.com/sorentwo/dependabot-private-repo. (I'm happy to transfer the repository or the instance over to dependabot.)
Hi @sorentwo, thank you for this contribution, for including tests, and for linking the hex server source code and making it public!
I'm not an elixir expert, but from what I can tell these changes look okay. I posted some questions that will hopefully make this easier to review.
I wonder if there's a package for verifying the integrity of public keys?
Since the changes and the Hex server source code check out, I'm going to run CI for this PR. We should be able to get back to you on making changes/merging this later in the week.
Thanks for taking a look!
I wonder if there's a Hex package for verifying the integrity of public keys?
There are libraries in erlang for doing the verification. I'll add another couple of scenarios to cover verifying the public key.
@Nishnha Ok, changes made. The last three commits are reasonably descriptive and atomic. The biggest changes are in the last commit and I'll leave some comments inline.
@Nishnha The rubocop errors are all fixed up now and CI should pass on the next run.
@sorentwo Dependabot also has a separate step for updating lockfiles: https://github.com/dependabot/dependabot-core/blob/main/hex/lib/dependabot/hex/file_updater/lockfile_updater.rb
It uses the same organization_credentials as the check_update and get_latest_resolvable_versions methods did in the version resolver, and would probably need access to hex_credentials to properly support private repos.
I'm not sure where #update_lockfile_content itself is called from, but you might be able to just copy the run.exs file a la https://github.com/dependabot/dependabot-core/blob/main/hex/lib/dependabot/hex/file_updater/lockfile_updater.rb#L25 and swap out the organization_credentials for hex_credentials and have it all work.
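As a rough sketch of the credential split described above, shared extraction logic might look something like this. It is not the code from this PR. It assumes credentials arrive as an array of hashes keyed by "type"; the type names ("hex_organization", "hex_repository") and the field names ("organization", "token", "repo", "url", "auth_key") are assumptions for illustration, and the method bodies are hypothetical.

```ruby
# Hypothetical shared helper: pick out credentials of one type.
# Assumption: each credential is a Hash with a "type" key.
def credentials_of_type(credentials, type)
  credentials.select { |cred| cred["type"] == type }
end

# Flattened [organization, token, ...] list for hex.pm organizations.
def organization_credentials(credentials)
  credentials_of_type(credentials, "hex_organization").flat_map do |cred|
    [cred["organization"], cred.fetch("token", "")]
  end
end

# Flattened [repo, url, auth_key, ...] list for self-hosted private repos.
# The "hex_repository" type and its fields are illustrative names only.
def hex_credentials(credentials)
  credentials_of_type(credentials, "hex_repository").flat_map do |cred|
    [cred["repo"], cred["url"], cred.fetch("auth_key", "")]
  end
end
```

Keeping the filtering in one place means the version resolver and the lockfile updater can pass the same flattened argument list to their helper scripts.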
I am also looking into any additional plumbing that we might need to flesh out internally to pass private hex repo credentials in from dependabot.yml following the schema in https://github.com/dependabot/dependabot-core/issues/1286#issuecomment-905696313
@Nishnha Great catch! I've made the suggested change, with a light refactoring to share the credential extraction logic.
Let me know if there's anything else I can do to support internal plumbing changes.
@sorentwo thanks for your hard work! I took a look at the PR, just to try and get a sense of what you're doing, and everything looks great
@Nishnha I hope you don't mind the ping, but I wanted to bump this up to you! It's currently blocking my use of dependabot in our Elixir application :(
Hi @sorentwo and @cam-carter
We discussed prioritizing this issue during our team's sprint planning meeting today.
Unfortunately, we don't have the capacity for dedicated engineers to work on the internal-half of this issue this quarter. However, it is being tracked in our backlog, so we may get to it in-between planned project work.
The internal work mostly consists of writing a parser for the private repo schema that we would add to dependabot.yml and adding documentation on the feature.
The Elixir/Erlang/Hex ecosystem has been growing in importance across the GitHub security supply chain, so we hope to prioritize this--alongside other Hex ecosystem features--as planned project work on our roadmap, but we can't really provide a solid time estimate for when that might be.
Really hoping this one can be merged soon!
Thanks @sorentwo! Glad to see this being handled.
Thank you as well @sorentwo, looking forward to seeing this PR merged soon! :smiley:
Hi @Nishnha, I hope you don't mind the ping! I was wondering if you might have a better idea on the timeline for this issue and the internal work that would be necessary by now?
@SophieDeBenedetto as an Elixirist inside of GitHub, any chance you could offer the dependabot team any insight into the internal work necessary to support this? Can the community be leveraged to assist if the dependabot team doesn't have any bandwidth to support what is necessary to get this working?
hello, anything new about this PR please?
@SophieDeBenedetto as an Elixirist inside of GitHub, any chance you could offer the dependabot team any insight into the internal work necessary to support this? Can the community be leveraged to assist if the dependabot team doesn't have any bandwidth to support what is necessary to get this working?
Hey @gjastrab! Sorry I completely missed this over the summer.
any chance you could offer the dependabot team any insight into the internal work necessary to support this
I'm not sure how to answer this question right now. It seems like you have a working solution in this PR so could you elaborate on what is left to be done that could benefit from additional support?
Can the community be leveraged to assist if the dependabot team doesn't have any bandwidth to support what is necessary to get this working
I'd be happy to help with this if we can flesh out what is needed a bit. One option would be to identify a working group through the Erlang Ecosystem Foundation, which I sit on the board of, and appeal for volunteers there. Another option might be to identify some folks willing to take this on and apply for funding through the foundation. Otherwise, I can help put the word out for maintainers more generally around the community.
I'd be happy to set up a time to chat if you think that would be helpful. Let me know!
I'm not sure how to answer this question right now. It seems like you have a working solution in this PR so could you elaborate on what is left to be done that could benefit from additional support?
Hi @SophieDeBenedetto! I'm the author of this branch. We do indeed have a working solution for private hex packages. As I understand it, the dependabot team needs to update the UI to support the new private hex options and that is all closed source work internal to GitHub.
There's no need for funding! Merging private hex repo support is important for Oban Web+Pro, and I'm happy to assist anywhere else I can. If there are any changes, docs, or guidance I can offer please let me know.
Hi @SophieDeBenedetto no worries on missing the message, hope you had a great summer!
My call for attention was due to this comment above
Unfortunately, we don't have the capacity for dedicated engineers to work on the internal-half of this issue this quarter.
So what I was getting at with my "Can the community be leveraged to assist" ask was: can any of the work be delegated to the community (not an ask to fund an external working group), if the blocker to this is a matter of no internal GH resource being available to pursue this work? As @sorentwo mentioned above, I'd also be happy to assist if possible.
I think it's fair to say that Oban has been rapidly growing as a stable pillar within the Elixir community, so to have companies support the project w/ purchasing Oban Pro only to then have dependabot broken for Elixir projects in GitHub is quite unfortunate.
If there's any way the problem can be inverted so it's not blocked by internal GH resourcing that would be amazing!
Supporting private hex repos would be hugely beneficial. We were really enjoying using dependabot at my company, but when we recently adopted Oban Pro, a few weeks later we realized dependabot was broken.
Unfortunately, it seems like there's no workaround if one of the dependencies is hosted on a private hex repo and Oban Pro is very popular, so this is affecting lots of customers. Dependabot has been a valuable tool to keep our dependencies up-to-date, so we'd love if this fix could be prioritized so we don't have to go back to manually monitoring our dependencies like we did pre-dependabot.
Thanks for building such a great tool and please let us know if there's anything we can do to help!
Hi again @sorentwo and @gjastrab, thanks for clarifying, I now understand that this PR is blocked by internal work that has yet to be sourced by GitHub's dependabot team. I don't think there's anything that the external community can do to move that work forward at this time, and I'm not aware of the dependabot team's priorities at this time. I'll try reaching out to see if that team is able to share any further info with me.
Been tracking this for about a year now since I first brought it up to @sorentwo that using Oban Pro broke our ability to use Dependabot. I definitely understand there are internal resourcing priorities, but this is crucial to the community and it would be super welcome to see GitHub recognize that and push this forward. The community stepped up to do the open source side of this work. Here's hoping GitHub will step up to do its part and we can get Dependabot working again for Elixir apps that use private Hex repos!
Couldn't agree more with @lleger!
Agreed! @SophieDeBenedetto can you see if the Dependabot team can prioritize this? It looks like most of the work has already been done.
We switched to Depfu solely because of this. Would be great to see this addressed so we can consider dependabot again.
Hello from the Dependabot team! We're aware of this request, but haven't been able to dedicate engineering hours to it amidst a lot of other, long-standing improvements to Dependabot the service.
We're going to figure out what needs to be done for this on our end tomorrow. I'm hesitant to share the results because I can't make any commitments, but I'll strive for some level of transparency. We do want to improve support for this community.
Separately, I want to ask that @SophieDeBenedetto not be pinged about this or other Dependabot issues. It's an awkward position to be put in. She's been an advocate for the Elixir/Erlang community and has escalated concerns to our team. However, getting this across the finish line is solely on the Dependabot team.
@jurre and I paired for a bit to determine what needs to be done to make this change a reality. We've identified some steps that would need to be taken, but we need some more information before we can say how much work this will actually be.
Context
We believe a couple of changes will need to be made here. I'll list those below. First, it might help to give some context on how Dependabot makes authenticated requests in GitHub's production environments. We're careful to make sure that dependabot-core never has direct access to any credentials. Instead, all http(s) requests from dependabot-core are routed through a job-specific proxy server that holds these credentials[^1]. The proxy server then injects the credentials for the registry that's being accessed for the request.
Tasks
- Due to how credentials are handled in our production environments, the changes in this PR will not affect requests to private Hex registries. To get authentication right, we would need to implement this logic in the proxy server. The proxy server's source code is private, so this work will have to be done by the Dependabot team.
- In order to pass along a public key fingerprint to the proxy, we would need to update the private registry schema for Hex. This should be relatively straightforward. This will also need to be done by the Dependabot team.
Our ask
We're not clear on the HTTP contract for private Hex servers. We'll need to understand this better in order to update the proxy server. Ideally, this would be some kind of spec for private Hex servers, but short of that we could use documentation. We're hoping that you folks could provide us with an up-to-date spec that you can vouch for.
Timeline
I hope to not sound hand-wavy here, but the timeline will depend on what's involved in meeting the spec. When we have that, we can figure out what changes need to be made.
[^1]: The proxy server is short-lived, and only accessible for an individual job.
I haven't been actively working in Elixir for a while, and I don't have access to a project w/ oban_pro anymore, but here's some thoughts from cursory research I had done on this.
A good example of how private repos are used is contained in oban pro's installation docs.
Repo config docs are here.
I had done an initial bit of digging on this and while hex repos in dependabot-core today have hard-coded urls based on the package name, the bundler logic is more robust and I assume could be used as an example of how to implement it in hex. Bundler uses a bundler subprocess to look up the private repos, and while those helpers don't exist for hex today, I'm sure they could be replicated for hex to use.
The dependency file parser would need to parse the repo that's listed w/ the hex package in the mix.lock, and pass it to get the url. The repo should be the second to last element of the tuple.
# mix.lock
%{
"oban_pro": {:hex, :oban_pro, "0.12.6", "[trunc]", [:list, :of, :deps], [], "oban", "[trunc]"}
}
mix hex.repo show oban --url will get you the url that the dependency should be pulled from.
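To make the "second to last element" idea concrete, here is a minimal Ruby sketch that pulls the repo name out of a single mix.lock entry. The helper name is hypothetical, and it assumes the eight-element :hex tuple shape shown above, where the last two elements are strings. It uses a regex rather than a real Elixir term parser, so it is only an illustration of where the repo name lives, not how dependabot-core parses lockfiles.

```ruby
# Hypothetical helper (not part of dependabot-core): return the repo name from
# one mix.lock entry, or nil when the entry is not a :hex tuple.
# Assumption: the tuple ends with two strings, repo name then outer checksum.
def hex_repo_from_lock_entry(entry)
  tuple_body = entry[/\{:hex,.*\}/m]
  return nil unless tuple_body

  # Collect every double-quoted string in the tuple, in order of appearance.
  strings = tuple_body.scan(/"((?:[^"\\]|\\.)*)"/).flatten
  strings.length >= 2 ? strings[-2] : nil
end

entry = '"oban_pro": {:hex, :oban_pro, "0.12.6", "[trunc]", [:list, :of, :deps], [], "oban", "[trunc]"}'
hex_repo_from_lock_entry(entry) # => "oban"
```

The resulting name could then be passed to mix hex.repo show to look up the URL.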
Edit: A lot of this may already be handled by the code in this PR, I completely forgot that this was a PR and not just an issue.
Double Edit: Here's the hexpm specifications
@landongrindheim Thank you for looking into what it will take to complete this. If you need any additional details beyond the hexpm specifications that @BobbyMcWho linked above, please let me know.
Update
I was able to spend some time on this last Friday and paired with @jurre today to see where we could get. Still can't make any promises re: timeline, but we're much closer than we were.
In trying to reason through the private repositories spec and this PR, I think we understand the way Hex will handle these requests once this is in place. This package ecosystem seems to work differently from others that we handle, so I want to make sure. Does the following sound accurate?
- Make an unauthenticated request to the private registry's /public_key endpoint
- Verify the validity of the private registry (public key/fingerprint comparisons)
- Hex seems to handle the details once we've compared the public key/fingerprint
- Make authenticated requests against the verified private registry
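The first and last steps of that flow can be sketched in Ruby by building (but not sending) the two kinds of request. This is only an illustration under stated assumptions: the helper names and the example URL are hypothetical, and sending the repo's auth key as the raw value of an authorization header is an assumption about the HTTP contract, not something this thread confirms.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: resolve a path relative to the repo's base URL.
def repo_uri(repo_url, path)
  URI.join(repo_url.chomp("/") + "/", path)
end

# Step 1: the key is requested without credentials.
def public_key_request(repo_url)
  Net::HTTP::Get.new(repo_uri(repo_url, "public_key"))
end

# Final step: later requests carry the auth key.
# Assumption: the key is sent as the raw "authorization" header value.
def authenticated_request(repo_url, path, auth_key)
  request = Net::HTTP::Get.new(repo_uri(repo_url, path))
  request["authorization"] = auth_key
  request
end
```

In a proxy-based setup like the one described earlier, only the component that holds the credentials would perform the equivalent of authenticated_request.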
We'd like to test our internal implementation before shipping this change. We don't have a private Hex server. Are we able to test against the instance listed in the description of this PR?
This package ecosystem seems to work differently from others that we handle, so I want to make sure. Does the following sound accurate?
That's accurate as I understand it. There is an exception: if the "skip key validation" option is used, then it doesn't compare the signature to the private key.
Are we able to test against the instance listed in the description of this PR?
Absolutely! That's precisely why I built and deployed it.
@sorentwo I've been trying to set up a test repository using the private Hex server instance you shared. I forked the repo for the service and deployed it, but have so far been unable to get the dependencies declared properly for Dependabot to update.
I've bypassed the trust mechanisms to get jason set up in a mix.exs/mix.lock, but get the following response when running mix deps.get:
Request failed (404)
** (Mix) Package fetch failed and no cached copy available ({{url}}/tarballs/jason-1.1.0.tar)
Could you set up a repo that we could fork and have your private Hex server instance running so we can do some tests internally?
One more question: does the request to /public_key typically require authentication? It seems to on the shared instance, but I wouldn't have expected it because it's a public key. If it does require auth, we'll need to account for that internally, too.