Improve performance of ImageDecoder

Open jantonguirao opened this issue 7 months ago • 50 comments

Category:

Other Performance

Description:

  • Populate the CUDA event pool with a few events per thread, to avoid creating events during the execution of the pipeline
  • Use task priority in the thread pool used by nvimagecodec, so that tasks are executed in FIFO order
  • Fix preallocated_batch_size to take hw_load into account. Not doing so causes cuMemFree calls during the pipeline run.
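
For reviewers who want a quick mental model of the first two points, here is a minimal, language-agnostic sketch in Python. It is not the DALI or nvimagecodec code (that lives in C++ and works with real CUDA events). `PrefilledPool`, `FifoPriorityQueue`, and the `created_on_demand` counter are hypothetical names introduced only for illustration, and the sketch assumes FIFO order is obtained by using a monotonically increasing submission index as the task priority.

```python
import heapq
import itertools


class PrefilledPool:
    """Hypothetical object pool that is filled up front.

    Stands in for a CUDA event pool: if enough objects are created at
    construction time, acquire() never has to create one on the hot path.
    """

    def __init__(self, factory, count):
        self._factory = factory
        self._free = [factory() for _ in range(count)]
        self.created_on_demand = 0  # counts creations that happen after setup

    def acquire(self):
        if self._free:
            return self._free.pop()
        # Fallback path: this is the allocation the prefill is meant to avoid.
        self.created_on_demand += 1
        return self._factory()

    def release(self, obj):
        self._free.append(obj)


class FifoPriorityQueue:
    """Hypothetical priority queue whose priority is the submission index.

    Lower index means higher priority, so tasks come out in the order
    they were pushed (FIFO).
    """

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def push(self, task):
        # The index is unique, so the task itself is never compared.
        heapq.heappush(self._heap, (next(self._counter), task))

    def pop(self):
        _, task = heapq.heappop(self._heap)
        return task

    def __len__(self):
        return len(self._heap)


# Assumed sizes, chosen only for the example.
num_threads, events_per_thread = 4, 2
pool = PrefilledPool(object, num_threads * events_per_thread)
events = [pool.acquire() for _ in range(num_threads * events_per_thread)]

tasks = FifoPriorityQueue()
for name in ["decode_0", "decode_1", "decode_2"]:
    tasks.push(name)
order = [tasks.pop() for _ in range(len(tasks))]
```

With the pool sized to `num_threads * events_per_thread`, every `acquire()` in the example is served from the prefilled list, and `order` matches the submission order. If the pool were undersized, `created_on_demand` would become non-zero, which is the runtime creation this change tries to avoid.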

Additional information:

Affected modules and functionalities:

ImageDecoder mostly (any operator that has a mixed backend thread pool)

  • ImageDecoder

Key points relevant for the review:

Tests:

  • [x] Existing tests apply
  • [ ] New tests added
    • [ ] Python tests
    • [ ] GTests
    • [ ] Benchmark
    • [ ] Other
  • [ ] N/A

Checklist

Documentation

  • [x] Existing documentation applies
  • [ ] Documentation updated
    • [ ] Docstring
    • [ ] Doxygen
    • [ ] RST
    • [ ] Jupyter
    • [ ] Other
  • [ ] N/A

DALI team only

Requirements

  • [ ] Implements new requirements
  • [ ] Affects existing requirements
  • [x] N/A

REQ IDs: N/A

JIRA TASK: N/A

jantonguirao • Jul 22 '24 12:07