Bartłomiej Różański

Results: 7 comments by Bartłomiej Różański

That would be super cool. I was thinking about a word counter, but timestamps could also be easily consumed by external tools.

@gkucsko any chance of bringing this up? I wonder how to accurately map seconds to the resulting samples. Apologies for the probably naive question, but could it be calculated from the generated audio...
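For what it's worth, the basic relation between samples and seconds is just a division by the sample rate. Below is a minimal sketch of that arithmetic, not anything Bark itself provides: the helper names `samples_to_seconds` and `segment_timestamps` are hypothetical, and the 24 kHz default is an assumption about Bark's output rate that should be checked against the `SAMPLE_RATE` your install exports.

```python
# Assumption: the generated audio is a 1-D array of samples at 24 kHz.
# Verify this against the SAMPLE_RATE constant of the library you use.
ASSUMED_SAMPLE_RATE = 24_000


def samples_to_seconds(num_samples, sample_rate=ASSUMED_SAMPLE_RATE):
    """Convert a sample count to a duration in seconds."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return num_samples / sample_rate


def segment_timestamps(segment_lengths, sample_rate=ASSUMED_SAMPLE_RATE):
    """Return (start, end) times in seconds for audio segments
    that are concatenated one after another.

    segment_lengths: number of samples in each generated segment.
    """
    timestamps = []
    offset = 0  # running position in samples
    for length in segment_lengths:
        start = offset / sample_rate
        offset += length
        timestamps.append((start, offset / sample_rate))
    return timestamps
```

If each sentence is generated separately and the resulting arrays are concatenated, passing their lengths to `segment_timestamps` gives per-sentence timestamps. Word-level timing inside a single generated segment cannot be recovered this way.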

It would be great to run Bark in Elixir. Also, this TTS model has attracted a lot of attention recently: https://github.com/collabora/WhisperSpeech

I tried to port Bark and later WhisperSpeech. They use multiple models to convert text to semantics, semantics to audio and encode... Anyway, more promising models have appeared recently...

@michelson not yet, but I'm working on it. These models don't use standard layers, or if they do at all, they are in pickle format. I needed to step back to understand simpler...

I'm currently playing around with Tacotron 2 text-to-speech, and since it's the simplest TTS I've found, I'm trying to reproduce it in Elixir. I used `nx_signal` to process audio files and generate...

I was thinking it might be one of the torchaudio vocoders, like Griffin-Lim (the output sounds robotic), WaveRNN (most likely this one), or NVIDIA WaveGlow, to turn mel spectrograms into audio, but I just...