RNN Transducer Loss

Open • vincentqb opened this issue 4 years ago • 1 comment

This issue is to track the follow-up work to #1137, which introduced rnnt_loss and RNNTLoss as a prototype in torchaudio.prototype.transducer using HawkAaron's warp-transducer.

  • Update documentation
  • Extend guards for prototype
    • [ ] Guard prototype python files by omitting them from torchaudio, see also https://github.com/pytorch/audio/pull/1137#discussion_r551496192
    • [x] Guard building third party transducer even if not added as an extension (#1159)
    • [x] Enable building transducer in nightlies only, disable for release.
  • Update building process
    • [x] Pass along the DEBUG flag to cmake
    • [x] Remove hardcoded O2/O3 optimization, see https://github.com/pytorch/audio/pull/1137#discussion_r551498022 (#1159)
    • [x] Build within same folders as libsox, https://github.com/pytorch/audio/pull/1137#discussion_r551499829 and https://github.com/pytorch/audio/pull/1137#discussion_r551556305 (#1159)
    • [x] Move libsox to a third_party subfolder as suggested in https://github.com/pytorch/audio/pull/1137#discussion_r550321378 (#1161).
    • [x] Add GPU implementation and compilation. (see https://github.com/pytorch/audio/pull/1483)
    • [x] Add a USE_CUDA option for users: the build currently depends on the presence of a device, see here, and pytorch.
    • [ ] Add CUDA build binaries, https://github.com/pytorch/audio/pull/1497
  • Modernization
    • [x] Migrate the checks to C++.
    • [x] Add autograd test https://github.com/pytorch/audio/pull/1532
    • [x] Add TorchScriptability test (attempt, internal).
    • [ ] Investigate using AT_DISPATCH_FLOATING_TYPES.
    • [x] Update bindings to remove pytorch deprecation warnings. (#1160)
    • [x] Refactor and update the API, see warprnnt and internal.
    • [x] Add support for float16.
    • [ ] rnnt_loss should not capture the gradient here. (Should the custom C++ autograd function for rnnt_loss return the gradient?)
    • [x] Remove numpy test utilities from tests
    • [ ] Replace the parameter change with an assertion here
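
For anyone checking the loss-related items above (autograd test, numpy-free test utilities), here is a minimal pure-Python sketch of the quantity an RNN transducer loss computes for one utterance, using the standard forward recursion. It is not the warp-transducer or torchaudio implementation. The function name `rnnt_nll_reference`, the nested-list input layout, and the default `blank=0` are assumptions made for illustration, and the inputs are assumed to be already log-softmax normalized.

```python
import math


def log_add(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def rnnt_nll_reference(log_probs, targets, blank=0):
    """Negative log-likelihood of `targets` for a single utterance.

    log_probs[t][u][k] is the log-probability of emitting symbol k at
    time step t after u target symbols have been emitted, with
    t in [0, T) and u in [0, U]. Hypothetical helper, not a library API.
    """
    T = len(log_probs)
    U = len(targets)
    # alpha[t][u]: log-probability of reaching lattice node (t, u).
    alpha = [[-math.inf] * (U + 1) for _ in range(T)]
    alpha[0][0] = 0.0
    for t in range(T):
        for u in range(U + 1):
            if t > 0:
                # Arrive by emitting blank at (t - 1, u).
                alpha[t][u] = log_add(
                    alpha[t][u], alpha[t - 1][u] + log_probs[t - 1][u][blank]
                )
            if u > 0:
                # Arrive by emitting target symbol u - 1 at (t, u - 1).
                alpha[t][u] = log_add(
                    alpha[t][u],
                    alpha[t][u - 1] + log_probs[t][u - 1][targets[u - 1]],
                )
    # Every valid alignment ends with a blank emitted at (T - 1, U).
    return -(alpha[T - 1][U] + log_probs[T - 1][U][blank])
```

A reference like this is slow, but it gives tests a value to compare against on tiny inputs without depending on numpy.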

cc @astaff, internal

vincentqb • Feb 04 '21 22:02

Is there a plan to support packed-layout logits for the RNNT loss?

Ref: Sec 3.1 https://arxiv.org/abs/1909.12415

maxwellzh • Jun 17 '22 07:06
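
To make the question above concrete, here is a rough sketch of what a packed layout could look like, as opposed to padding every utterance to (max T, max U + 1). This is one reading of the idea, not torchaudio's API: the helper names and the row-major ordering of (t, u) pairs within each utterance are assumptions for illustration.

```python
def packed_offsets(logit_lengths, target_lengths):
    """Start row of each utterance in a packed (total_rows, vocab) buffer.

    Utterance b contributes T_b * (U_b + 1) rows, one per lattice node.
    Hypothetical helper, not a library API.
    """
    offsets = []
    total = 0
    for T, U in zip(logit_lengths, target_lengths):
        offsets.append(total)
        total += T * (U + 1)
    return offsets, total


def packed_index(offsets, target_lengths, b, t, u):
    """Row of lattice node (t, u) of utterance b, assuming row-major order."""
    return offsets[b] + t * (target_lengths[b] + 1) + u


def padded_rows(logit_lengths, target_lengths):
    """Row count of the usual padded (B, max_T, max_U + 1) layout."""
    return len(logit_lengths) * max(logit_lengths) * (max(target_lengths) + 1)


# Two utterances with (T, U) = (4, 2) and (2, 1).
logit_lengths = [4, 2]
target_lengths = [2, 1]
offsets, total = packed_offsets(logit_lengths, target_lengths)
print(offsets, total)                                # [0, 12] 16
print(padded_rows(logit_lengths, target_lengths))    # 24
```

In this toy batch the packed buffer holds 16 rows where the padded one holds 24; the gap grows as lengths within a batch become more uneven.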