LightGBM
Fix DockerFile GPU CUDA GPG Keys
The Docker in the repository no longer works, as NVIDIA updated their GPG keys in Apr 2022.
The dockerfile has been updated here
Hello @jameslamb ,
Thank you for your comment on the PR. I understand that `master` -> `master` PRs are not encouraged for this repo. Would you like me to close this PR, and then make a new one with a different branch pointing to LightGBM/master?
Also, I would like to highlight that the current Docker image does not work, hence my changes to the GPG key. Let me know if you would like me to change anything :)
> then make a new one with a different branch
Nope, it's ok! You can keep this one. But once this is merged, I recommend deleting and re-creating your fork (or using `git reset` + `git push --force` to rewrite history on `master` of your fork).
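For reference, the `git reset` + `git push --force` route could look roughly like the sketch below. The function name `sync_fork_master` and the remote names `upstream` (the main repository) and `origin` (the fork) are assumptions for illustration, not something specified in this thread.

```shell
# Sketch: make the fork's master identical to the upstream master.
# Assumes a remote for the main repository already exists (default name:
# "upstream") and that the fork is the remote named "origin".
# WARNING: this discards any commits that exist only on the fork's master.
sync_fork_master() {
    upstream_remote="${1:-upstream}"
    fork_remote="${2:-origin}"

    git fetch "$upstream_remote" &&
    git checkout master &&
    # Move the local master to exactly where upstream's master points.
    git reset --hard "$upstream_remote/master" &&
    # Overwrite the fork's master with the rewritten history.
    git push --force "$fork_remote" master
}
```

With the remotes set up, calling `sync_fork_master` (or `sync_fork_master upstream origin`) from inside the clone would run the four steps in order and stop at the first failure.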
Thank you for your reply, I understand what you meant! I did not have any other code modified in the origin of my LightGBM fork, which is why I had made it from master. In the future, I will make the PR from a branch created out of master/main :)
For testing Docker images: one quick idea is to try it on Google Colab or the free GPU instances on Kaggle. I have tested it on my own environment with an RTX 2060 and official NVIDIA drivers (Ubuntu), and it seems to work great. I cannot think of any way to write tests for Docker images.
I personally don't like this line I had to add to the Dockerfile (it looks too hardcoded):
RUN apt-key adv --keyserver keyserver.ubuntu.com --recv-keys A4B469963BF863CC
Though it errors out without it on any system due to NVIDIA deprecating its keys.
Edit: I have replaced the hardcoded key with the syntax NVIDIA recommends in their blog post, to make it more maintainable. @jameslamb
> Edit: I have replaced the hardcoded key with the syntax NVIDIA recommends in their blog post
Here it is: https://developer.nvidia.com/blog/updating-the-cuda-linux-gpg-repository-key/.
@jameslamb I have made the changes, I am very sorry for the delay as I was travelling. Making the other pull request too (I guess it is better to make it after this one is merged).
No problem! Thanks very much.
I started testing this on a `g4dn` instance on AWS last night. Didn't quite finish, will try in the next few days.
@Arka161 could you clarify what specifically you mean by "The Docker in the repository no longer works"? For example, does it fail to build? fail at runtime? something else?
When I was testing on an AWS EC2 instance last night, I was able to build an image from `docker/gpu/dockerfile.gpu` on latest `master`, without any modifications.
Hello @jameslamb: I get this kind of error when I build the GPU Dockerfile on the master branch of LightGBM.
Basically, something like:
W: GPG error: ___ trusty InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY ___
I got this on my server and my personal Ubuntu machine, and I was unable to build the image. It appears that NVIDIA changed their CUDA Linux GPG keys, which is what causes the error.
The error gets fixed with the lines I added, where I update the GPG keys as per the suggestions in their blog (using their `cuda-keyring`). This way, we will always have this Dockerfile up to date and never face GPG key issues on any machine.
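As a rough illustration of the `cuda-keyring` approach from the NVIDIA blog post, the steps could be wrapped as below. This is a sketch, not the exact lines from this PR: the function names are hypothetical, the `distro` and `arch` values (for example `ubuntu1804` and `x86_64`) are assumptions that must match the base image, and package version `1.0-1` is the one named in the April 2022 post, so a newer one may exist. In a Dockerfile, these commands would run inside a `RUN` step as root, with `wget` installed.

```shell
# Build the download URL for NVIDIA's cuda-keyring package.
# $1: distro (e.g. ubuntu1804), $2: arch (e.g. x86_64). Both are assumptions
# to be matched to the base image in use.
cuda_keyring_url() {
    distro="$1"
    arch="$2"
    echo "https://developer.download.nvidia.com/compute/cuda/repos/${distro}/${arch}/cuda-keyring_1.0-1_all.deb"
}

# Replace the outdated CUDA repository signing key with the new keyring.
# Must run as root on a Debian/Ubuntu system.
install_cuda_keyring() {
    url="$(cuda_keyring_url "$1" "$2")"
    # Remove the old signing key that NVIDIA rotated out.
    apt-key del 7fa2af80
    # Download and install the keyring package, which provides the new key.
    wget -q "$url" -O /tmp/cuda-keyring.deb &&
    dpkg -i /tmp/cuda-keyring.deb
}
```

For example, `install_cuda_keyring ubuntu1804 x86_64` would fetch and install the keyring for that repository path. Compared with pinning a key ID through `apt-key adv --recv-keys`, the key itself is no longer hardcoded, although the package version in the URL still is.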
Ok thanks for that. I saw that exact same warning (that is what the `W:` means), but was still able to build the image on Amazon Linux.
~Are you working in an environment where anything writing to stderr causes a process to exit with an error? Like on Windows using Powershell?~ Sorry, just saw your comment says "Ubuntu".
Anyway, I'll come back to this some time in the next few days, once I've had a chance to finish testing, with a reproducible example and a more thorough review. Thanks for your patience.
I'm going to close this PR based on https://github.com/microsoft/LightGBM/pull/5369#pullrequestreview-1064356007 and due to lack of response.
@Arka161 if you see an issue with the conclusions in that comment please do leave a comment here and we can re-open this pull request.
Thanks for your interest in LightGBM, come back and contribute any time!
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.