
C++ Vocoder issue - possibly memory allocation

Open tiberiu44 opened this issue 5 years ago • 2 comments

Hi @geneing

First of all, congratulations on this awesome work! I've been running into a possible bug in the C++ implementation. I tried the model on a couple of machines at home and noticed a few things:

  1. The C++ implementation seems noisier than the PyTorch one. I think this is expected given the -ffast-math flag, but it could also be related to the next point:
  2. On my laptop I get a segmentation fault when loading the model. It doesn't happen every time, but I usually see garbage characters in the layer names. I ran Valgrind and got the following trace:
Loading:Conv1d(64, 64, kernel_size=(1,), stride=(1,), bias=False)
Loading:BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_runn
Loading:Conv1d(64, 64, kernel_size=(1,), stride=(1,), bias=False)
Loading:BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_runn
Loading:Conv1d(64, 64, kernel_size=(1,), stride=(1,), bias=False)
Loading:BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_runn
Loading:Conv1d(64, 64, kernel_size=(1,), stride=(1,), bias=False)
Loading:BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_runn
Loading:Conv1d(64, 64, kernel_size=(1,), stride=(1,), bias=False)
Loading:BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_runn
Loading:Conv1d(64, 64, kernel_size=(1,), stride=(1,))
Loading:Stretch2d()
==13230== Invalid read of size 8
==13230==    at 0x99858C1: BatchNorm1dLayer::apply(Eigen::Matrix<float, -1, -1, 1, -1, -1> const&) (in /home/tibi/Projects/WaveRNN-Pytorch/library/build/WaveRNNVocoder.cpython-37m-x86_64-linux-gnu.so)
==13230==    by 0x1000000FF: ???
==13230==    by 0x10F434EA343443FF: ???
==13230==    by 0x4EA583F: ???
==13230==    by 0x998B66E: TorchLayer::loadNext(_IO_FILE*) (in /home/tibi/Projects/WaveRNN-Pytorch/library/build/WaveRNNVocoder.cpython-37m-x86_64-linux-gnu.so)
==13230==    by 0x9991483: Vocoder::loadWeights(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (in /home/tibi/Projects/WaveRNN-Pytorch/library/build/WaveRNNVocoder.cpython-37m-x86_64-linux-gnu.so)
==13230==    by 0x998B997: void pybind11::cpp_function::initialize<pybind11::cpp_function::initialize<void, Vocoder, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, pybind11::name, pybind11::is_method, pybind11::sibling>(void (Vocoder::*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&)::{lambda(Vocoder*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)#1}, void, Vocoder*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, pybind11::name, pybind11::is_method, pybind11::sibling>(pybind11::cpp_function::initialize<void, Vocoder, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, pybind11::name, pybind11::is_method, pybind11::sibling>(void (Vocoder::*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&)::{lambda(Vocoder*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)#1}&&, void (*)(Vocoder*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call) (in /home/tibi/Projects/WaveRNN-Pytorch/library/build/WaveRNNVocoder.cpython-37m-x86_64-linux-gnu.so)
==13230==    by 0x9994239: pybind11::cpp_function::dispatcher(_object*, _object*, _object*) (in /home/tibi/Projects/WaveRNN-Pytorch/library/build/WaveRNNVocoder.cpython-37m-x86_64-linux-gnu.so)
==13230==    by 0x5C80FA: _PyMethodDef_RawFastCallKeywords (in /home/tibi/Projects/WaveRNN-Pytorch/venv/bin/python3)
==13230==    by 0x5CA038: _PyObject_FastCallKeywords (in /home/tibi/Projects/WaveRNN-Pytorch/venv/bin/python3)
==13230==    by 0x5367D0: ??? (in /home/tibi/Projects/WaveRNN-Pytorch/venv/bin/python3)
==13230==    by 0x53D360: _PyEval_EvalFrameDefault (in /home/tibi/Projects/WaveRNN-Pytorch/venv/bin/python3)
==13230==  Address 0x9 is not stack'd, malloc'd or (recently) free'd
==13230== 
==13230== 
==13230== Process terminating with default action of signal 11 (SIGSEGV)
==13230==  Access not within mapped region at address 0x9
==13230==    at 0x99858C1: BatchNorm1dLayer::apply(Eigen::Matrix<float, -1, -1, 1, -1, -1> const&) (in /home/tibi/Projects/WaveRNN-Pytorch/library/build/WaveRNNVocoder.cpython-37m-x86_64-linux-gnu.so)
==13230==    by 0x1000000FF: ???
==13230==    by 0x10F434EA343443FF: ???
==13230==    by 0x4EA583F: ???
==13230==    by 0x998B66E: TorchLayer::loadNext(_IO_FILE*) (in /home/tibi/Projects/WaveRNN-Pytorch/library/build/WaveRNNVocoder.cpython-37m-x86_64-linux-gnu.so)
==13230==    by 0x9991483: Vocoder::loadWeights(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (in /home/tibi/Projects/WaveRNN-Pytorch/library/build/WaveRNNVocoder.cpython-37m-x86_64-linux-gnu.so)
==13230==    by 0x998B997: void pybind11::cpp_function::initialize<pybind11::cpp_function::initialize<void, Vocoder, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, pybind11::name, pybind11::is_method, pybind11::sibling>(void (Vocoder::*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&)::{lambda(Vocoder*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)#1}, void, Vocoder*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, pybind11::name, pybind11::is_method, pybind11::sibling>(pybind11::cpp_function::initialize<void, Vocoder, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, pybind11::name, pybind11::is_method, pybind11::sibling>(void (Vocoder::*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&)::{lambda(Vocoder*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)#1}&&, void (*)(Vocoder*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call) (in /home/tibi/Projects/WaveRNN-Pytorch/library/build/WaveRNNVocoder.cpython-37m-x86_64-linux-gnu.so)
==13230==    by 0x9994239: pybind11::cpp_function::dispatcher(_object*, _object*, _object*) (in /home/tibi/Projects/WaveRNN-Pytorch/library/build/WaveRNNVocoder.cpython-37m-x86_64-linux-gnu.so)
==13230==    by 0x5C80FA: _PyMethodDef_RawFastCallKeywords (in /home/tibi/Projects/WaveRNN-Pytorch/venv/bin/python3)
==13230==    by 0x5CA038: _PyObject_FastCallKeywords (in /home/tibi/Projects/WaveRNN-Pytorch/venv/bin/python3)
==13230==    by 0x5367D0: ??? (in /home/tibi/Projects/WaveRNN-Pytorch/venv/bin/python3)
==13230==    by 0x53D360: _PyEval_EvalFrameDefault (in /home/tibi/Projects/WaveRNN-Pytorch/venv/bin/python3)
==13230==  If you believe this happened as a result of a stack
==13230==  overflow in your program's main thread (unlikely but
==13230==  possible), you can try to increase the size of the
==13230==  main thread stack using the --main-stacksize= flag.
==13230==  The main thread stack size used in this run was 8388608.
==13230== 
==13230== HEAP SUMMARY:
==13230==     in use at exit: 14,004,524 bytes in 9,510 blocks
==13230==   total heap usage: 52,675 allocs, 43,165 frees, 126,334,690 bytes allocated
==13230== 
==13230== LEAK SUMMARY:
==13230==    definitely lost: 32 bytes in 1 blocks
==13230==    indirectly lost: 33 bytes in 2 blocks
==13230==      possibly lost: 774,467 bytes in 1,082 blocks
==13230==    still reachable: 13,229,992 bytes in 8,425 blocks
==13230==                       of which reachable via heuristic:
==13230==                         length64           : 229,872 bytes in 462 blocks
==13230==         suppressed: 0 bytes in 0 blocks
==13230== Rerun with --leak-check=full to see details of leaked memory
==13230== 
==13230== Use --track-origins=yes to see where uninitialised values come from
==13230== For lists of detected and suppressed errors, rerun with: -s
==13230== ERROR SUMMARY: 28873 errors from 295 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)

I tried looking over the C++ code myself, but I'm having a hard time since I haven't worked in C++ in quite a while.

Again, really nice work and looking forward to your answer.

tiberiu44 avatar Feb 21 '20 17:02 tiberiu44

I have the same problem. Have you been able to solve it?

ferqui avatar Jul 27 '20 17:07 ferqui

I just found the problem. The method Stretch2dLayer *Stretch2dLayer::loadNext(FILE *fd) is supposed to return a Stretch2dLayer pointer, but it never returns anything. Flowing off the end of a non-void function is undefined behavior, so the caller receives a garbage pointer. Adding return this; at the end of the method should fix it.

ferqui avatar Aug 01 '20 19:08 ferqui