unable to find EGL device for CUDA device 0

Open uahic opened this issue 2 years ago • 2 comments

Habitat-Sim version

version: main

I know this issue has been reported several times and is often solved by installing the GPU drivers properly. In my case, I'd like to run Habitat in an Ubuntu 20 or 22 container without GPU support, but for some reason I can't get it working anymore without Habitat asking for CUDA devices for libEGL. I did have Habitat running on my machine using Docker once. I don't know whether I put dependencies in my apt-install list that shouldn't be there (related to GL in some way) or compiled Habitat in a different way.

My laptop has Intel onboard graphics and a dedicated Nvidia RTX A2000 GPU. The container is meant to be used by my students, who do not have a dedicated GPU.

Side note: X11 apps like xclock do work; a new window opens on my host as expected. So I guess indirect rendering is not the issue here, but rather the way Habitat tries to access the graphics hardware (direct rendering).

This is what my docker-compose file looks like:


version: '3'
services:
  simulation:
    image: habitat
    build:
      context: .
      dockerfile: Dockerfile
    stdin_open: true
    tty: true
    env_file:
      - .env
    environment:
      - "DISPLAY=${DISPLAY}"
      - "XAUTHORITY=${XAUTHORITY}"
    volumes:
      - /tmp/.X11-unix:/tmp/.X11-unix:rw
      - ~/.Xauthority:/root/.Xauthority:rw
    network_mode: "host"
    devices:
      - "/dev/dri:/dev/dri"
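
To double-check what the `devices` mapping above actually exposes inside the container, a small sketch along these lines could be run there. The helper name `list_dri_nodes` is made up for illustration; it only lists the directory and reads `DISPLAY`, and it does not touch EGL or Habitat at all.

```python
import os


def list_dri_nodes(dri_dir="/dev/dri"):
    """Return the sorted entry names under dri_dir, or [] if it does not exist."""
    if not os.path.isdir(dri_dir):
        return []
    return sorted(os.listdir(dri_dir))


if __name__ == "__main__":
    # Which entries show up depends on the host; an empty list would mean
    # the /dev/dri mapping did not reach the container.
    print("DRI nodes:", list_dri_nodes())
    print("DISPLAY  :", os.environ.get("DISPLAY", "<unset>"))
```

An empty list or an unset `DISPLAY` would point at the container setup rather than at Habitat itself.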

This is what the head of the Dockerfile looks like:

FROM ubuntu:20.04

# Set TimeZone
ENV TZ=America/Los_Angeles
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
ARG DEBIAN_FRONTEND=noninteractive

RUN apt-get update && apt-get install -y \
    build-essential \
    apt-utils \
    git \
    curl \
    vim \
    ca-certificates \
    libjpeg-dev \
    libpng-dev \
    libglfw3-dev \
    libglm-dev \
    # libx11-dev \
    libomp-dev \
    x11-apps \
    libegl1-mesa-dev \
    pkg-config \
    wget \
    tzdata \
    zip \
    tk-dev \
    python3-tk \
    python3-pip \
    unzip \
    libxrandr-dev \
    libxinerama-dev \
    libxcursor-dev \
    libasound2 \
    libatk1.0-0 \
    libatk-bridge2.0-0 \
    libc6 \
    libcairo2 \
    libcups2 \
    libdbus-1-3  \
    libexpat1 \
    libfontconfig1 \
    libgcc1 \
    libgconf-2-4 \
    libgdk-pixbuf2.0-0 \
    libglib2.0-0 \
    libgtk-3-0 \
    libnspr4 \
    libpango-1.0-0 \
    libpangocairo-1.0-0 \
    libstdc++6 \
    # libx11-6 \
    # libx11-xcb1 \
    libxcb1 \
    libxcomposite1 \
    libxcursor1 \
    libxdamage1 \
    libxext6 \
    libxfixes3 \
    libxi6 \
    libxrandr2 \
    libxrender1 \
    libxss1 \
    libxtst6 \
    ca-certificates \
    fonts-liberation \
    libappindicator1 \
    libnss3 \
    lsb-release \
    xdg-utils \
    libxi-dev \
    gnupg2 \
    python3-dev \
    ninja-build \
    python-tk \
    software-properties-common \
    wget -y &&\
    rm -rf /var/lib/apt/lists/*

# Install cmake
RUN wget https://github.com/Kitware/CMake/releases/download/v3.14.0/cmake-3.14.0-Linux-x86_64.sh
RUN mkdir /opt/cmake
RUN sh /cmake-3.14.0-Linux-x86_64.sh --prefix=/opt/cmake --skip-license
RUN ln -s /opt/cmake/bin/cmake /usr/local/bin/cmake
RUN cmake --version

RUN python3 -m pip install numpy==1.23.0
RUN git clone --branch stable https://github.com/facebookresearch/habitat-sim.git
RUN cd habitat-sim; python3 -m pip install -r requirements.txt; python3 setup.py install

Hardware:

➜  ~ sudo lshw -C display
[sudo] password for schulze: 
Sorry, try again.
[sudo] password for schulze: 
  *-display                 
       description: VGA compatible controller
       product: GA107GLM [RTX A2000 Mobile]
       vendor: NVIDIA Corporation
       physical id: 0
       bus info: pci@0000:01:00.0
       logical name: /dev/fb0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress vga_controller bus_master cap_list rom fb
       configuration: depth=32 driver=nvidia latency=0 mode=1920x1080 visual=truecolor xres=1920 yres=1080
       resources: iomemory:600-5ff iomemory:610-60f irq:194 memory:ad000000-adffffff memory:6000000000-60ffffffff memory:6100000000-6101ffffff ioport:3000(size=128) memory:ae080000-ae0fffff
  *-display
       description: VGA compatible controller
       product: TigerLake-H GT1 [UHD Graphics]
       vendor: Intel Corporation
       physical id: 2
       bus info: pci@0000:00:02.0
       logical name: /dev/fb0
       version: 01
       width: 64 bits
       clock: 33MHz
       capabilities: pciexpress msi pm vga_controller bus_master cap_list rom fb
       configuration: depth=32 driver=i915 latency=0 resolution=1920,1080
       resources: iomemory:610-60f iomemory:400-3ff irq:193 memory:614c000000-614cffffff memory:4000000000-400fffffff ioport:4000(size=64) memory:c0000-dffff memory:4010000000-4016ffffff memory:4020000000-40ffffffff
➜  ~ 

And this is the output after running the example script, also trying to use other displays:


 ⚡ root  examples  fffa5437 v0.2.3 ?  ✘  DISPLAY=:2 python example.py
sim_cfg.physics_config_file = data/default.physics_config.json
[06:54:19:303357]:[Metadata] AttributesManagerBase.h(380)::createFromJsonOrDefaultInternal : <Dataset>: Proposing JSON name : default.scene_dataset_config.json from original name : default| This file does not exist.
[06:54:19:303419]:[Metadata] AssetAttributesManager.cpp(123)::createObject : Asset attributes (capsule3DSolid:capsule3DSolid_hemiRings_4_cylRings_1_segments_12_halfLen_0.75_useTexCoords_false_useTangents_false) created and registered.
[06:54:19:303439]:[Metadata] AssetAttributesManager.cpp(123)::createObject : Asset attributes (capsule3DWireframe:capsule3DWireframe_hemiRings_8_cylRings_1_segments_16_halfLen_1) created and registered.
[06:54:19:303450]:[Metadata] AssetAttributesManager.cpp(123)::createObject : Asset attributes (coneSolid:coneSolid_segments_12_halfLen_1.25_rings_1_useTexCoords_false_useTangents_false_capEnd_true) created and registered.
[06:54:19:303475]:[Metadata] AssetAttributesManager.cpp(123)::createObject : Asset attributes (coneWireframe:coneWireframe_segments_32_halfLen_1.25) created and registered.
[06:54:19:303483]:[Metadata] AssetAttributesManager.cpp(123)::createObject : Asset attributes (cubeSolid:cubeSolid) created and registered.
[06:54:19:303490]:[Metadata] AssetAttributesManager.cpp(123)::createObject : Asset attributes (cubeWireframe:cubeWireframe) created and registered.
[06:54:19:303502]:[Metadata] AssetAttributesManager.cpp(123)::createObject : Asset attributes (cylinderSolid:cylinderSolid_rings_1_segments_12_halfLen_1_useTexCoords_false_useTangents_false_capEnds_true) created and registered.
[06:54:19:303517]:[Metadata] AssetAttributesManager.cpp(123)::createObject : Asset attributes (cylinderWireframe:cylinderWireframe_rings_1_segments_32_halfLen_1) created and registered.
[06:54:19:303530]:[Metadata] AssetAttributesManager.cpp(123)::createObject : Asset attributes (icosphereSolid:icosphereSolid_subdivs_1) created and registered.
[06:54:19:303543]:[Metadata] AssetAttributesManager.cpp(123)::createObject : Asset attributes (icosphereWireframe:icosphereWireframe_subdivs_1) created and registered.
[06:54:19:303556]:[Metadata] AssetAttributesManager.cpp(123)::createObject : Asset attributes (uvSphereSolid:uvSphereSolid_rings_8_segments_16_useTexCoords_false_useTangents_false) created and registered.
[06:54:19:303567]:[Metadata] AssetAttributesManager.cpp(123)::createObject : Asset attributes (uvSphereWireframe:uvSphereWireframe_rings_16_segments_32) created and registered.
[06:54:19:303576]:[Metadata] AssetAttributesManager.cpp(112)::AssetAttributesManager : Built default primitive asset templates : 12
[06:54:19:303772]:[Metadata] SceneDatasetAttributesManager.cpp(37)::createObject : File (default) not found, so new default dataset attributes created  and registered.
[06:54:19:303776]:[Metadata] MetadataMediator.cpp(120)::createSceneDataset : Dataset default successfully created.
[06:54:19:303786]:[Metadata] AttributesManagerBase.h(380)::createFromJsonOrDefaultInternal : <Physics Manager>: Proposing JSON name : ./data/default.physics_config.json from original name : ./data/default.physics_config.json| This file does not exist.
[06:54:19:303795]:[Metadata] PhysicsAttributesManager.cpp(26)::createObject : File (./data/default.physics_config.json) not found, so new default physics manager attributes created and registered.
[06:54:19:303800]:[Metadata] MetadataMediator.cpp(203)::setActiveSceneDatasetName : Previous active dataset  changed to default successfully.
[06:54:19:303824]:[Metadata] AttributesManagerBase.h(380)::createFromJsonOrDefaultInternal : <Physics Manager>: Proposing JSON name : data/default.physics_config.json from original name : data/default.physics_config.json| This file does not exist.
[06:54:19:303831]:[Metadata] PhysicsAttributesManager.cpp(26)::createObject : File (data/default.physics_config.json) not found, so new default physics manager attributes created and registered.
[06:54:19:303837]:[Metadata] MetadataMediator.cpp(66)::setSimulatorConfiguration : Set new simulator config for scene/stage : data/scene_datasets/habitat-test-scenes/skokloster-castle.glb and dataset : default which is currently active dataset.
Platform::WindowlessEglApplication: eglQueryDeviceStringEXT(EGLDevice=0): EGL_EXT_device_drm
eglQueryDeviceAttribEXT(): eglQueryDeviceAttribEXT
Platform::WindowlessEglApplication: eglQueryDeviceStringEXT(EGLDevice=1): EGL_EXT_device_drm
eglQueryDeviceAttribEXT(): eglQueryDeviceAttribEXT
Platform::WindowlessEglApplication: eglQueryDeviceStringEXT(EGLDevice=2): EGL_MESA_device_software
eglQueryDeviceAttribEXT(): eglQueryDeviceAttribEXT
Platform::WindowlessEglApplication::tryCreateContext(): unable to find EGL device for CUDA device 0
WindowlessContext: Unable to create windowless context
 ⚡ root  examples  fffa5437 v0.2.3 ?  ✘  

I read some introductions to GL, Mesa, etc. (e.g. https://dzone.com/articles/hardware-accelerated-opengl-rendering-in-a-linux-container) but I still don't fully understand why it isn't working, particularly since I told it to use the X server on display :2 (which is the Intel graphics).
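
For reference, the device enumeration in the log above can be summarised with a short script. This is only a sketch: `parse_egl_devices` is a hypothetical helper written for this log format, not part of Habitat or Magnum, and the regular expression assumes the lines look exactly like the ones pasted here.

```python
import re

# Matches lines such as:
# Platform::WindowlessEglApplication: eglQueryDeviceStringEXT(EGLDevice=0): EGL_EXT_device_drm
DEVICE_LINE = re.compile(r"eglQueryDeviceStringEXT\(EGLDevice=(\d+)\):\s*(.*)")


def parse_egl_devices(log_text):
    """Map each EGL device index to the list of extension names reported for it."""
    devices = {}
    for line in log_text.splitlines():
        match = DEVICE_LINE.search(line)
        if match:
            devices[int(match.group(1))] = match.group(2).split()
    return devices


if __name__ == "__main__":
    sample = (
        "Platform::WindowlessEglApplication: eglQueryDeviceStringEXT(EGLDevice=0): EGL_EXT_device_drm\n"
        "Platform::WindowlessEglApplication: eglQueryDeviceStringEXT(EGLDevice=1): EGL_EXT_device_drm\n"
        "Platform::WindowlessEglApplication: eglQueryDeviceStringEXT(EGLDevice=2): EGL_MESA_device_software\n"
    )
    for index, extensions in parse_egl_devices(sample).items():
        print(index, extensions)
```

On the log above this gives three devices: two that report `EGL_EXT_device_drm` and one that reports `EGL_MESA_device_software`. None of the reported strings mentions CUDA, which at least seems consistent with the lookup for CUDA device 0 failing.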

Thanks for any help :-)

uahic avatar Feb 10 '23 15:02 uahic

Hey @uahic,

The logs might provide more information on the issue when running with the environment variable MAGNUM_LOG=verbose. Hopefully this will give some pointers.
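
As a rough sketch, the variable can be set for a single run from Python like this. The helper names `verbose_env` and `run_verbose` are made up for illustration, and `example.py` is assumed to be the script from the report above.

```python
import os
import subprocess
import sys


def verbose_env(base_env=None):
    """Return a copy of the environment with MAGNUM_LOG set to verbose."""
    env = dict(os.environ if base_env is None else base_env)
    env["MAGNUM_LOG"] = "verbose"
    return env


def run_verbose(script):
    """Run a Python script with verbose Magnum logging and return its exit code."""
    result = subprocess.run([sys.executable, script], env=verbose_env())
    return result.returncode


# Usage (inside the container, from the examples directory):
#   run_verbose("example.py")
```

From a shell, prefixing the command with `MAGNUM_LOG=verbose` has the same effect.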

I had habitat once running on my machine using Docker

Could you test an older version of Habitat? That would show whether the issue comes from a regression or from a change in the environment.

0mdc avatar Feb 20 '23 23:02 0mdc

Hi @0mdc, sure! I will try that out today or tomorrow and report back.

uahic avatar Feb 24 '23 12:02 uahic