fairseq2
Unable to run fairseq2 inside nvidia/cuda docker container due to libfairseq2n.so.0: undefined symbol error
Describe the bug:
Our Python application uses a Seamless model to perform speech-to-text transformations. To host the application, we use an Azure virtual machine (Standard NC24ads A100 v4, 24 vCPUs, 220 GiB memory) running Ubuntu 22.04. Docker is installed on the virtual machine, and the application is deployed with Docker.
We have the following Dockerfile:
# Start with a base image that includes CUDA
FROM nvidia/cuda:12.1.0-base-ubuntu20.04
ENV CACHE_ROOT_FOLDER=/cache/verse-extractor-cache
# Set the timezone to UTC
ENV TZ=UTC
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
# Install some necessary tools
RUN apt-get update && \
    apt-get install -y ffmpeg && \
    apt-get install -y python3 && \
    apt-get install -y python3-pip && \
    # Install Git
    apt-get install -y git && \
    # Clean up to reduce container size
    rm -rf /var/lib/apt/lists/*
# Set up a working directory
WORKDIR /app
# Copy requirements.txt and install Python packages using pip
COPY requirements.txt ./
RUN pip3 install -r requirements.txt
# Install fairseq2
RUN pip install fairseq2 --extra-index-url https://fair.pkg.atmeta.com/fairseq2/whl/pt2.2.0/cu121
# Copy the rest of your app
COPY . /app
# Set the command to run your app
CMD ["python3", "app.py"]
Our requirements.txt looks like this:
openai-whisper
flask
sentence-transformers
pandas
yt-dlp
pip-system-certs
tiktoken
triton
torch
torchaudio
ffmpeg
pydub
openpyxl
sentencepiece
git+https://github.com/facebookresearch/seamless_communication.git
We use docker-compose to build and run the image
version: '3.8'
services:
  verse-extractor-api:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "8000:8000"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [ gpu ]
I have the following simple app.py file:
from seamless_communication.inference import Translator
import logging
logging.basicConfig(level=logging.INFO)
if __name__ == '__main__':
    logging.info("Starting the app")
When I build the Docker image with sudo docker-compose up --build -d and then view the logs with sudo docker-compose logs, I see the following error message:
Traceback (most recent call last):
  File "app.py", line 1, in <module>
    from seamless_communication.inference import Translator
  File "/usr/local/lib/python3.8/dist-packages/seamless_communication/__init__.py", line 9, in <module>
    from fairseq2.assets import FileAssetMetadataProvider, asset_store
  File "/usr/local/lib/python3.8/dist-packages/fairseq2/assets/__init__.py", line 7, in <module>
    from fairseq2.assets.card import AssetCard as AssetCard
  File "/usr/local/lib/python3.8/dist-packages/fairseq2/assets/card.py", line 28, in <module>
    from fairseq2.data.typing import is_string_like
  File "/usr/local/lib/python3.8/dist-packages/fairseq2/data/__init__.py", line 7, in <module>
    from fairseq2.data.cstring import CString as CString
  File "/usr/local/lib/python3.8/dist-packages/fairseq2/data/cstring.py", line 61, in <module>
    from fairseq2n.bindings.data.string import CString as CString
ImportError: /usr/local/lib/python3.8/dist-packages/fairseq2n/lib/libfairseq2n.so.0: undefined symbol: _ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE
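The mangled symbol in the ImportError can be partly decoded by hand, which may help narrow down which library is expected to provide it. The snippet below is only a rough sketch: the helper name nested_name_parts is made up for illustration, and it reads just the leading length-prefixed identifiers of an Itanium-ABI nested name rather than acting as a full demangler.

```python
def nested_name_parts(mangled):
    """Return the leading identifiers of a simple Itanium-mangled nested name.

    Hypothetical helper for illustration: it only handles the
    "_ZN<len><name><len><name>...E" prefix and ignores everything after it
    (argument types, templates, substitutions).
    """
    prefix = "_ZN"
    if not mangled.startswith(prefix):
        raise ValueError("not a nested Itanium-mangled name")
    parts = []
    i = len(prefix)
    while i < len(mangled) and mangled[i].isdigit():
        # Read the decimal length that precedes each identifier.
        j = i
        while j < len(mangled) and mangled[j].isdigit():
            j += 1
        length = int(mangled[i:j])
        parts.append(mangled[j:j + length])
        i = j + length
    return parts


# Symbol copied from the ImportError above.
symbol = (
    "_ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_"
    "10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE"
)
print("::".join(nested_name_parts(symbol)))  # prints at::_ops::zeros::call
```

The at namespace belongs to ATen, the tensor library shipped inside PyTorch, so libfairseq2n.so.0 appears to be looking for a function that the installed torch package does not export with that exact signature. That would be consistent with a mismatch between the installed PyTorch and the one the fairseq2n wheel was built against, although this alone does not prove it.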
What am I doing wrong?
Describe how to reproduce:
- Pull the app.py, Dockerfile, requirements.txt, and docker-compose.yml files I've mentioned above
- Copy them to a single directory
- Make sure you have docker and docker compose installed
- Run docker-compose up --build -d
Describe the expected behavior:
The application should launch successfully without any errors.
Environment:
- OS: Ubuntu 22.04
- fairseq2: 0.2.0
- PyTorch: 2.2.0
- Python: 3.8
- CUDA: 12.1
- GPU: NVIDIA A100
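The versions listed above can be double-checked from inside the container without importing fairseq2, since the import itself is what fails. The following is a minimal sketch using only the standard library (importlib.metadata, available from Python 3.8). The function name installed_versions and the choice of packages to inspect are assumptions made for illustration.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed list of relevant distributions; adjust as needed.
    names = ["torch", "torchaudio", "fairseq2", "fairseq2n"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```

Because requirements.txt does not pin torch, pip installs whichever release is current at build time, so the version reported here may differ from the 2.2.0 implied by the pt2.2.0/cu121 index URL used for fairseq2. If the two differ, pinning torch and torchaudio in requirements.txt to the release matching that index is one thing to try.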