failed to EGL with glad.

Open xiaofeifei-1 opened this issue 1 year ago • 4 comments

System: Ubuntu 20.04 GPU: RTX 2080 Ti

When running training.py, I get this error:

[2024-08-04 17:45:44,106][calvin_agent.datasets.base_dataset][INFO] - finished loading dataset
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name               | Type                   | Params
--------------------------------------------------------------
0 | perceptual_encoder | ConcatEncoders         | 174 K 
1 | plan_proposal      | PlanProposalNetwork    | 13.9 M
2 | plan_recognition   | PlanRecognitionNetwork | 36.0 M
3 | visual_goal        | VisualGoalEncoder      | 4.4 M 
4 | language_goal      | LanguageGoalEncoder    | 5.1 M 
5 | action_decoder     | LogisticPolicyNetwork  | 13.8 M
--------------------------------------------------------------
73.2 M    Trainable params
0         Non-trainable params
73.2 M    Total params
146.424   Total estimated model params size (MB)
Sanity Checking: 0it [00:00, ?it/s]pybullet build time: Nov 28 2023 23:51:11
[2024-08-04 17:45:46,190][calvin_agent.wrappers.calvin_env_wrapper][INFO] - EGL_DEVICE_ID 0 <==> CUDA_DEVICE_ID 0
argv[0]=--width=200
argv[1]=--height=200
[2024-08-04 17:45:46,261][calvin_env.envs.play_table_env][INFO] - Loading EGL plugin (may segfault on misconfigured systems)...
failed to EGL with glad.
/home/xxx/anaconda3/envs/calvin_venv/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 8 leaked shared_memory objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

xiaofeifei-1 avatar Aug 04 '24 09:08 xiaofeifei-1

calvin/calvin_env/egl_check$ python list_egl_options.py
----------Default-------------
Starting EGL query
Loaded EGL 1.5 after reload.
GL_VENDOR=NVIDIA Corporation
GL_RENDERER=NVIDIA GeForce RTX 2080 Ti/PCIe/SSE2
GL_VERSION=3.3.0 NVIDIA 535.183.01
GL_SHADING_LANGUAGE_VERSION=3.30 NVIDIA via Cg compiler
Completeing EGL query
b'EGL device choice: -1 of 4.\n'
number of EGL devices: 4
----------Option #1 (id=0)-------------
Starting EGL query
EGL device choice: 0 of 4 (from EGL_VISIBLE_DEVICE)
Loaded EGL 1.5 after reload.
GL_VENDOR=NVIDIA Corporation
CUDA_DEVICE=0
GL_RENDERER=NVIDIA GeForce RTX 2080 Ti/PCIe/SSE2
GL_VERSION=3.3.0 NVIDIA 535.183.01
GL_SHADING_LANGUAGE_VERSION=3.30 NVIDIA via Cg compiler
Completeing EGL query

----------Option #2 (id=1)-------------
Starting EGL query
EGL device choice: 1 of 4 (from EGL_VISIBLE_DEVICE)
libEGL warning: DRI2: failed to create dri screen
libEGL warning: DRI2: failed to create dri screen
eglInitialize() failed with error: 3008

----------Option #3 (id=2)-------------
Starting EGL query
EGL device choice: 2 of 4 (from EGL_VISIBLE_DEVICE)
Loaded EGL 1.5 after reload.
GL_VENDOR=Intel
GL_RENDERER=Mesa Intel(R) UHD Graphics 630 (CFL GT2)
GL_VERSION=4.6 (Core Profile) Mesa 21.2.6
GL_SHADING_LANGUAGE_VERSION=4.60
Completeing EGL query

----------Option #4 (id=3)-------------
Starting EGL query
EGL device choice: 3 of 4 (from EGL_VISIBLE_DEVICE)
Loaded EGL 1.5 after reload.
GL_VENDOR=Mesa/X.org
GL_RENDERER=llvmpipe (LLVM 12.0.0, 256 bits)
GL_VERSION=4.5 (Core Profile) Mesa 21.2.6
GL_SHADING_LANGUAGE_VERSION=4.50
Completeing EGL query

What can I do to solve "failed to EGL with glad"?
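To read the list_egl_options.py output above more easily, here is a minimal sketch that maps each EGL device id to its reported GL_VENDOR. The function names are hypothetical and the parser assumes only the text layout shown in the log above; it is not part of calvin_env.

```python
import re

# Matches section headers such as "----------Option #1 (id=0)-------------".
OPTION_RE = re.compile(r"-+Option #\d+ \(id=(\d+)\)-+")


def vendors_by_device(output):
    """Map EGL device id -> GL_VENDOR string, or None if that query failed."""
    result = {}
    current = None
    for line in output.splitlines():
        match = OPTION_RE.search(line)
        if match:
            current = int(match.group(1))
            result[current] = None
        elif current is not None and line.startswith("GL_VENDOR="):
            result[current] = line.split("=", 1)[1].strip()
    return result


def nvidia_device_ids(output):
    """Return the device ids whose vendor string mentions NVIDIA."""
    return [dev for dev, vendor in vendors_by_device(output).items()
            if vendor and "NVIDIA" in vendor]
```

Applied to the output above, this would report id 0 as the only NVIDIA device, which matches the "EGL_DEVICE_ID 0 <==> CUDA_DEVICE_ID 0" line in the training log.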

xiaofeifei-1 avatar Aug 04 '24 10:08 xiaofeifei-1

I am facing the same error.

pranavgundewar avatar Aug 22 '24 23:08 pranavgundewar

If you want to render in headless mode, make sure the DISPLAY environment variable is unset. Otherwise you may get the error "failed to EGL with glad", because EGL is sensitive to DISPLAY. Run unset DISPLAY (not unset $DISPLAY, which expands the variable first and tries to unset its value).
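The same thing can be done from Python before the environment is created. This is a minimal sketch; the helper names are hypothetical, and where exactly to call them in training.py is an assumption.

```python
import os


def headless_env(env=None):
    """Return a copy of an environment mapping with DISPLAY removed."""
    base = os.environ if env is None else env
    cleaned = dict(base)
    cleaned.pop("DISPLAY", None)  # no error if DISPLAY is already absent
    return cleaned


def unset_display():
    """Remove DISPLAY from the current process, if it is set."""
    os.environ.pop("DISPLAY", None)
```

Call unset_display() at the top of the script, before the EGL plugin is loaded, or pass env=headless_env() to subprocess.run when launching the training script from another process.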

AddASecond avatar Sep 04 '24 11:09 AddASecond

Although I see the 'libEGL' libraries, I am still facing the same error even when I run 'list_egl_options.py' for verification. I have seen in other places that the 'libEGL_nvidia.so*' libraries are also needed. Is that the case?

System: Ubuntu 20.04 (WSL) GPU: RTX 8000

sb93 avatar Oct 11 '24 20:10 sb93

(mdt_env) zzc@zzc-System-Product-Name:~/mydata/mdt/mdt_policy/calvin_env/egl_check$ python list_egl_options.py
----------Default-------------
Starting EGL query
Loaded EGL 1.5 after reload.
GL_VENDOR=Mesa/X.org
GL_RENDERER=llvmpipe (LLVM 12.0.0, 256 bits)
GL_VERSION=4.5 (Core Profile) Mesa 21.2.6
GL_SHADING_LANGUAGE_VERSION=4.50
Completeing EGL query
b'EGL device choice: -1 of 2.\nlibEGL warning: DRI2: failed to create dri screen\nlibEGL warning: DRI2: failed to create dri screen\n'
number of EGL devices: 2
----------Option #1 (id=0)-------------
Starting EGL query
EGL device choice: 0 of 2 (from EGL_VISIBLE_DEVICE)
libEGL warning: DRI2: failed to create dri screen
libEGL warning: DRI2: failed to create dri screen
eglInitialize() failed with error: 3008

----------Option #2 (id=1)-------------
Starting EGL query
EGL device choice: 1 of 2 (from EGL_VISIBLE_DEVICE)
Loaded EGL 1.5 after reload.
GL_VENDOR=Mesa/X.org
GL_RENDERER=llvmpipe (LLVM 12.0.0, 256 bits)
GL_VERSION=4.5 (Core Profile) Mesa 21.2.6
GL_SHADING_LANGUAGE_VERSION=4.50
Completeing EGL query

I also have the same error

ZZC-CN avatar Mar 17 '25 09:03 ZZC-CN

I also have the same error and unset DISPLAY can help me. Thx :D

CostaliyA avatar Mar 18 '25 09:03 CostaliyA

I also have the same error and unset DISPLAY can help me. Thx :D

I tried it, but it didn't help. Thank you anyway.

ZZC-CN avatar Mar 18 '25 12:03 ZZC-CN

@ZZC-CN Recently, to make my repo easier to use, I built a Docker image for my work, Corki. I faced the same problem as you and finally solved it. You can see my Dockerfile guidance for Corki and my solution to this problem. I think there are two main causes:

  1. Some packages are missing; check the Dockerfile here: Dockerfile
  2. The EGL ICD configuration file is missing. I don't know why this file is still missing even though I run the container with the argument -e NVIDIA_DRIVER_CAPABILITIES=all, so I add it with a script; see: prepare.sh
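For illustration, the second step can be sketched in Python. The contents of prepare.sh are not reproduced here, so everything below is an assumption: the directory /usr/share/glvnd/egl_vendor.d, the file name 10_nvidia.json, and the JSON body are the values commonly used for the NVIDIA EGL vendor file with libglvnd, and the function name is hypothetical.

```python
import json
from pathlib import Path

# Assumed location and contents of the NVIDIA EGL vendor (ICD) file.
DEFAULT_ICD_DIR = Path("/usr/share/glvnd/egl_vendor.d")
NVIDIA_ICD = {
    "file_format_version": "1.0.0",
    "ICD": {"library_path": "libEGL_nvidia.so.0"},
}


def ensure_nvidia_egl_icd(icd_dir=DEFAULT_ICD_DIR, filename="10_nvidia.json"):
    """Write the ICD file if it is missing; return True if a file was created."""
    icd_dir = Path(icd_dir)
    target = icd_dir / filename
    if target.exists():
        return False  # leave an existing file untouched
    icd_dir.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(NVIDIA_ICD, indent=4) + "\n")
    return True
```

Writing to the default directory usually needs root, which is typically the case inside a container build. The file only points at libEGL_nvidia.so.0, so that library must still be present in the container.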

After completing these steps, I get the right result, with calvin using the GPU to render:

Loading robot-flamingo checkpoint from /modelzoo/checkpoint_gripper_post_hist_1_aug_10_4_traj_cons_ws_12_mpt_dolly_3b_9_fur_step_0.pth0.pth
argv[0]=--width=200
argv[1]=--height=200
EGL device choice: -1 of 8.
Loaded EGL 1.5 after reload.
GL_VENDOR=NVIDIA Corporation
GL_RENDERER=NVIDIA A100-SXM4-40GB/PCIe/SSE2
GL_VERSION=3.3.0 NVIDIA 550.90.07
GL_SHADING_LANGUAGE_VERSION=3.30 NVIDIA via Cg compiler
Version = 3.3.0 NVIDIA 550.90.07
Vendor = NVIDIA Corporation
Renderer = NVIDIA A100-SXM4-40GB/PCIe/SSE2
logging to /modelzoo/checkpoint_gripper_post_hist_1_aug_10_4_traj_cons_ws_12_mpt_dolly_3b_9_fur_step_0_action_num_5_h
  0%|          | 0/1000 [00:00<?, ?it/s]ven = NVIDIA Corporation
ven = NVIDIA Corporation
0/1|1/5 : 0.0% | 2/5 : 0.0% | 3/5 : 0.0% | 4/5 : 0.0% | 5/5 : 0.0% ||:   0%|          | 1/1000 [00:14<4:04:27, 14.68s/it]

I'm not sure if this will solve your problem, but I welcome people who encounter this problem to try it. If it works for everyone, I'll consider submitting a pull request for docker installation guidance to the calvin repo.

hyy02 avatar Apr 15 '25 03:04 hyy02

Although I see the 'libEGL' libraries, I am still facing the same error even when I run 'list_egl_options.py' for verification. I have seen in other places that the 'libEGL_nvidia.so*' libraries are also needed. Is that the case?

System: Ubuntu 20.04 (WSL) GPU: RTX 8000

@sb93 I also have the same error. Did you successfully download the 'libEGL_nvidia.so*' libraries? I used this command to check the download status of EGL and found that Nvidia's EGL is missing.

[Image]

How did you download libEGL_nvidia.so.0?
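Since the screenshot is not available here, a minimal sketch of an equivalent check: list the libEGL* libraries in a few library directories and report whether an NVIDIA one is among them. The directory list is an assumption (typical Linux locations; WSL may keep driver libraries elsewhere), and the function names are hypothetical.

```python
from pathlib import Path

# Assumed search locations; adjust for your distribution.
COMMON_LIB_DIRS = ("/usr/lib/x86_64-linux-gnu", "/usr/lib64", "/usr/lib")


def find_egl_libraries(lib_dirs=COMMON_LIB_DIRS):
    """Return paths of all libEGL*.so* files found in the given directories."""
    found = []
    for directory in lib_dirs:
        path = Path(directory)
        if path.is_dir():
            found.extend(sorted(str(f) for f in path.glob("libEGL*.so*")))
    return found


def has_nvidia_egl(lib_dirs=COMMON_LIB_DIRS):
    """True if any libEGL_nvidia.so* file is present in the given directories."""
    return any(Path(f).name.startswith("libEGL_nvidia.so")
               for f in find_egl_libraries(lib_dirs))
```

If has_nvidia_egl() returns False while the generic libEGL.so files are listed, EGL falls back to whatever vendors are installed, such as the Mesa llvmpipe renderer seen in the output earlier in this thread.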

1650292274 avatar Apr 27 '25 12:04 1650292274