The Triton backend that allows running GPU-accelerated data pre-processing pipelines implemented in DALI's Python API.

NOTE: dali_backend is available in tritonserver-20.11 and later.

:exclamation: IMPORTANT :exclamation:

dali_backend is new and rapidly growing. Official tritonserver releases might be behind on some features and bug fixes. We encourage you to use the latest version of dali_backend. The Docker build section explains how to build a tritonserver docker image with the main branch of dali_backend and a DALI nightly release. This is a way to get daily updates!
# DALI TRITON Backend
This repository contains code for DALI Backend for Triton Inference Server.

NVIDIA DALI (R), the Data Loading Library, is a collection of highly optimized building blocks, and an execution engine, that accelerates the pre-processing of input data for deep learning applications. DALI provides both the performance and the flexibility to accelerate different data pipelines as one library. This library can then be easily integrated into different deep learning training and inference applications, regardless of the deep learning framework used.
To find out more about DALI, please refer to our main page. Getting Started and Tutorials will guide you through your first steps, and Supported Operations will help you put together GPU-powered data processing pipelines.
## See any bugs?

Feel free to post an issue here or in DALI's GitHub repository.
## How to use?

1. A DALI data pipeline is expressed within Triton as a model. To create such a model, you have to put together a DALI pipeline in Python, and then either serialize it (by calling the `Pipeline.serialize` method) or use Autoserialization to generate a model file. As an example, we'll use a simple resizing pipeline:

   ```python
   import nvidia.dali as dali
   from nvidia.dali.plugin.triton import autoserialize

   @autoserialize
   @dali.pipeline_def(batch_size=256, num_threads=4, device_id=0)
   def pipe():
       images = dali.fn.external_source(device="cpu", name="DALI_INPUT_0")
       images = dali.fn.image_decoder(images, device="mixed")
       images = dali.fn.resize(images, resize_x=224, resize_y=224)
       return images
   ```

2. The model file shall be incorporated into Triton's Model Repository. Here's an example:

   ```
   model_repository
   └── dali
       ├── 1
       │   └── model.dali
       └── config.pbtxt
   ```

3. As is typical in Triton, your DALI model file shall be named `model.dali`. You can override this name in the model configuration by setting the `default_model_filename` option. Here's the whole `config.pbtxt` we use for the `ResizePipeline` example:

   ```
   name: "dali"
   backend: "dali"
   max_batch_size: 256
   input [
     {
       name: "DALI_INPUT_0"
       data_type: TYPE_UINT8
       dims: [ -1 ]
     }
   ]
   output [
     {
       name: "DALI_OUTPUT_0"
       data_type: TYPE_FP32
       dims: [ 224, 224, 3 ]
     }
   ]
   ```
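On the client side, each sample sent to `DALI_INPUT_0` is just the raw encoded file content viewed as a 1-D `uint8` array, matching the `TYPE_UINT8` / `dims: [ -1 ]` declaration above. A minimal sketch (the `load_encoded` helper is hypothetical, not part of dali_backend):

```python
import numpy as np

def load_encoded(path):
    # Read the raw (still encoded) bytes of an image file; decoding
    # happens later, inside the DALI pipeline (fn.image_decoder).
    with open(path, "rb") as f:
        buf = f.read()
    # DALI_INPUT_0 is declared as TYPE_UINT8 with dims [ -1 ], so each
    # sample is simply a 1-D uint8 view of the encoded byte stream.
    return np.frombuffer(buf, dtype=np.uint8)
```

Arrays produced this way can then be batched and sent with any Triton client library.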
## Autoserialization
When using DALI Backend in Triton, the user has to provide a DALI model in the Model Repository. The canonical way of expressing a model is to include a properly named serialized DALI model file there (`model.dali` by default). The issue with storing the model as a serialized file is that, after serialization, the model is obscure and almost impossible to read. The autoserialization feature allows the user to express the model in Python code in the model repository. With a Python-defined model, DALI Backend uses an internal serialization mechanism and exempts the user from manual serialization.

To use the autoserialization feature, the user needs to put a Python definition of the DALI pipeline inside the model file (`model.dali` by default, but the default file name can be configured in the `config.pbtxt`). Such a pipeline definition has to be decorated with `@autoserialize`, e.g.:
```python
import nvidia.dali as dali

@dali.plugin.triton.autoserialize
@dali.pipeline_def(batch_size=3, num_threads=1, device_id=0)
def pipe():
    '''An identity pipeline with autoserialization enabled'''
    data = dali.fn.external_source(device="cpu", name="DALI_INPUT_0")
    return data
```
A proper DALI pipeline definition in Python, together with autoserialization, shall meet the following conditions:

- Only a `pipeline_def` can be decorated with `autoserialize`.
- Only one pipeline definition may be decorated with `autoserialize` in a given model version.
While loading a model file, DALI Backend follows this precedence:

1. First, DALI Backend tries to load a serialized model from the user-specified model location in the `default_model_filename` property (`model.dali` if not specified explicitly);
2. If the previous fails, DALI Backend tries to load and autoserialize a Python pipeline definition from the user-specified model location. Important: in this case we require that the file name with the model definition ends with `.py`, e.g. `mymodel.py`;
3. If the previous fails, DALI Backend tries to load and autoserialize a Python pipeline definition from the `dali.py` file in a given model version.
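The lookup order above can be sketched in Python. This is a simplified illustration with a hypothetical `resolve_model_file` helper; the real logic lives inside the backend and also treats a failed deserialization (not just a missing file) as a reason to fall through:

```python
import os

def resolve_model_file(version_dir, default_name="model.dali"):
    # Sketch of the documented precedence (simplified: we only check file
    # existence and extension, while the real backend also falls through
    # when deserialization itself fails).
    candidate = os.path.join(version_dir, default_name)
    if os.path.isfile(candidate):
        # A .py file cannot be a serialized pipeline; it gets autoserialized.
        mode = "autoserialize" if candidate.endswith(".py") else "serialized"
        return candidate, mode
    # Fall back to a Python definition named dali.py in the model version.
    fallback = os.path.join(version_dir, "dali.py")
    if os.path.isfile(fallback):
        return fallback, "autoserialize"
    raise FileNotFoundError(f"no DALI model found in {version_dir}")
```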
If you did not tweak the model path definition in the `config.pbtxt` file, you should follow this rule of thumb:

- If you have a serialized pipeline, call the file `model.dali` and put it into the model repository;
- If you have a Python definition of a pipeline, which shall be autoserialized, call it `dali.py`.
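For the autoserialized case, the model repository from the earlier example would then look like this:

```
model_repository
└── dali
    ├── 1
    │   └── dali.py
    └── config.pbtxt
```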
## Tips & Tricks

- Currently, the only way to pass an input to the DALI pipeline from Triton is to use the `fn.external_source` operator. Therefore, there's a high chance that you'll want to use it to feed the encoded images (or any other data) into DALI.
- Give your `fn.external_source` operator the same name you give to the input in `config.pbtxt`.
## Known limitations

- DALI's `ImageDecoder` accepts data only from the CPU - keep this in mind when putting together your DALI pipeline.
- Triton accepts only a homogeneous batch shape. Feel free to pad your batch of encoded images with zeros.
- Due to DALI limitations, you might observe unnaturally increased memory consumption when defining an instance group for a DALI model with a `count` higher than 1. We suggest using the default instance group for a DALI model.
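The zero-padding workaround for the homogeneous-batch-shape limitation can be sketched as follows (the `pad_batch` helper is hypothetical, not a dali_backend API):

```python
import numpy as np

def pad_batch(samples):
    # Triton requires a homogeneous batch shape, but encoded images differ
    # in length; pad each 1-D uint8 sample with trailing zeros so that all
    # samples match the longest one, then stack them into a single batch.
    max_len = max(s.size for s in samples)
    return np.stack([np.pad(s, (0, max_len - s.size)) for s in samples])
```

The trailing zeros are ignored by the image decoder, which only reads the encoded stream.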
## How to build?

### Docker build

Building DALI Backend with docker is as simple as:

```shell
git clone --recursive https://github.com/triton-inference-server/dali_backend.git
cd dali_backend
docker build -f docker/Dockerfile.release -t tritonserver:dali-latest .
```

And `tritonserver:dali-latest` becomes your new tritonserver docker image.
### Bare metal

#### Prerequisites

To build dali_backend you'll need CMake 3.17+.

#### Using a fresh DALI release

In the event you need a newer DALI version than the one provided in the tritonserver image, you can use DALI's nightly builds. Just install whichever DALI version you like using pip (refer to the link for more info on how to do it). In this case, while building dali_backend, you'll need to pass the `-D TRITON_SKIP_DALI_DOWNLOAD=ON` option to your CMake build. dali_backend will find the latest DALI installed in your system and use that particular version.
#### Building

Building DALI Backend is really straightforward. One thing to remember is to clone the dali_backend repository with all of its submodules:

```shell
git clone --recursive https://github.com/triton-inference-server/dali_backend.git
cd dali_backend
mkdir build
cd build
cmake ..
make
```

The build process will generate a `unittest` executable. You can use it to run the unit tests for DALI Backend.