xgboost 2.0.1 is breaking on rootless docker.

Open prafullat opened this issue 10 months ago • 3 comments

On rootless Docker, the cgroup path is readable only by the root user.

As a result, if a non-root user tries to import xgboost, it fails with a permission error: Permission denied [/sys/fs/cgroup/cpu.max]

Can you please check for this and avoid the error if this path is not accessible? We are not able to use xgboost in a rootless Docker container.

prafullat avatar May 03 '24 05:05 prafullat

@prafullat Can you post the full stack trace for the permission error?

hcho3 avatar May 03 '24 22:05 hcho3

Will try to reproduce once I get back to work. At the moment my guess is that this line is to blame: https://github.com/dmlc/xgboost/blob/5e64276a9b95df57e6dd8f9e63347636f4e5d331/src/common/threading_utils.cc#L78

trivialfis avatar May 04 '24 12:05 trivialfis
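For context on what the file in the error message holds: on cgroup v2, `cpu.max` contains a single line of the form `<quota> <period>`, where the quota is either a number of microseconds or the literal `max` (no limit). The following is a minimal sketch of how such a line could be turned into a CPU count. `ParseCpuMax` is a hypothetical helper written for illustration, not the code in `threading_utils.cc`, and rounding up to at least 1 is an assumption.

```cpp
#include <algorithm>
#include <cmath>
#include <exception>
#include <sstream>
#include <string>

// Hypothetical helper for illustration; not XGBoost's actual implementation.
// Parses the content of a cgroup v2 cpu.max file: "<quota> <period>" or "max <period>".
// Returns the CPU limit rounded up (at least 1), or -1 when there is no limit
// or the content cannot be parsed.
int ParseCpuMax(std::string const& content) {
  std::istringstream in{content};
  std::string quota;
  double period{0.0};
  if (!(in >> quota >> period) || period <= 0.0) {
    return -1;  // malformed or empty content
  }
  if (quota == "max") {
    return -1;  // no CPU limit configured
  }
  double q{0.0};
  try {
    q = std::stod(quota);
  } catch (std::exception const&) {
    return -1;  // quota is neither "max" nor a number
  }
  if (q <= 0.0) {
    return -1;
  }
  return std::max(1, static_cast<int>(std::ceil(q / period)));
}
```

With this sketch, a quota of `200000` over a period of `100000` yields 2 CPUs, while `max 100000` yields -1 so that a caller can fall back to the hardware thread count.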

I tried launching an image with and without gosu, with Docker running in rootless mode, but couldn't reproduce the issue. @hcho3 might have better insight here.

$ docker info | grep "Root"
WARNING: No cpuset support
WARNING: No io.weight support
WARNING: No io.weight (per device) support
WARNING: No io.max (rbps) support
WARNING: No io.max (wbps) support
 Docker Root Dir: /home/ubuntu/.local/share/docker
WARNING: No io.max (riops) support
WARNING: No io.max (wiops) support

trivialfis avatar May 06 '24 19:05 trivialfis

Feel free to reopen if there is further information. Will try to fix it if there's a reproducer.

trivialfis avatar Jul 12 '24 08:07 trivialfis

I am having the exact same issue with a non-root user importing the package:

>>> import xgboost
terminate called after throwing an instance of 'std::filesystem::__cxx11::filesystem_error'
  what():  filesystem error: status: Permission denied [/sys/fs/cgroup/cpu.max]
Aborted (core dumped)

zhangzzk avatar Jul 17 '24 09:07 zhangzzk

Could you please share how you installed docker (or any other environment)?

trivialfis avatar Jul 17 '24 09:07 trivialfis

Thank you for the quick reply. Sorry, I am very much a beginner at this. I am a non-root user on a shared computing cluster, and I am not the one who set up the environment. Could you perhaps give more specific instructions on the information I can provide?

zhangzzk avatar Jul 17 '24 10:07 zhangzzk

> Could you perhaps give more specific instructions on the information I can provide?

Unfortunately, I would like those instructions as well. ;-( Maybe you can share the output of docker info so that I can compare it with mine? @hcho3 might be able to provide more insight into the system setup.

trivialfis avatar Jul 19 '24 09:07 trivialfis

I don't think I am running in Docker, so docker info is not a valid command.

zhangzzk avatar Jul 22 '24 14:07 zhangzzk

I will open a PR to catch all potential exceptions in that function.

trivialfis avatar Jul 22 '24 17:07 trivialfis
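A rough sketch of what catching all exceptions around the cgroup read could look like is given below. It is an assumption about the shape of such a fix, not the actual PR. The `status:` prefix in the reported error suggests that a throwing `std::filesystem` query (such as `exists` without an `std::error_code` argument) hit the unreadable path, so the sketch uses the non-throwing overload and a catch-all as a second line of defense. `TryReadFirstLine` is a hypothetical name.

```cpp
#include <filesystem>
#include <fstream>
#include <optional>
#include <string>
#include <system_error>

// Hypothetical helper for illustration; not the code from the XGBoost PR.
// Returns the first line of `path`, or std::nullopt if the file is missing,
// unreadable (e.g. permission denied), or empty. Never throws.
std::optional<std::string> TryReadFirstLine(std::filesystem::path const& path) noexcept {
  try {
    std::error_code ec;
    // The error_code overload reports failures through `ec` instead of
    // throwing std::filesystem::filesystem_error.
    if (!std::filesystem::exists(path, ec) || ec) {
      return std::nullopt;
    }
    std::ifstream fin{path};
    if (!fin) {
      return std::nullopt;  // the file exists but could not be opened
    }
    std::string line;
    if (!std::getline(fin, line)) {
      return std::nullopt;  // nothing to read
    }
    return line;
  } catch (...) {
    // Any remaining failure is treated as "no information available".
    return std::nullopt;
  }
}
```

A caller could then treat `std::nullopt` for `/sys/fs/cgroup/cpu.max` as "no cgroup limit known" and fall back to the hardware thread count, so that importing the library does not abort for non-root users.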