epub_to_audiobook

feat: local tts support

Open timgreen opened this issue 1 year ago • 16 comments

use piper as default.

issue #16

timgreen avatar Nov 18 '23 11:11 timgreen

Wow. Neat work. Will test and work on merge after this weekend. Thanks for the contribution.

p0n1 avatar Nov 18 '23 15:11 p0n1

I'm experiencing some issues with installing Piper on Mac, and I'm still working on resolving them.

p0n1 avatar Nov 21 '23 04:11 p0n1

@p0n1, I included another example in the README for https://github.com/coqui-ai/TTS, maybe you could try that instead.

timgreen avatar Nov 25 '23 11:11 timgreen

Hi @timgreen. I didn't expect that it could support both Coqui and Piper at the same time. That's really cool. I originally planned to test the use of Piper in a Linux environment, without considering the installation issues on Mac. However, my work has been busy lately, so I have to postpone it.

In addition, new contributors have joined recently. @Bryksin has done a great job and has comprehensively refactored the code to make it easier to integrate more TTS engines in the future. You can find some of the discussion here: https://github.com/p0n1/epub_to_audiobook/issues/21#issuecomment-1824987948.

I apologize again for not being able to merge your code in a timely manner. Perhaps it would be better if you could contribute based on the refactored code once it gets merged. Or I can help if you don't have the time.

Besides, we now have a Discord server for syncing up and general discussion. Feel free to join. You can find the invite URL here: https://github.com/p0n1/epub_to_audiobook/issues/15#issuecomment-1825854839.

p0n1 avatar Nov 25 '23 15:11 p0n1

@timgreen I've attempted to build your branch to see if I could get the piper TTS working, as I find it far better than Edge_TTS. However, when running the command with --tts local, I get the following error:

epub_to_audiobook.py: error: argument --tts: invalid choice: 'local' (choose from 'azure', 'openai')

Apologies if this isn't the correct place to reach out with a question on this.

EnderSyth avatar May 01 '24 05:05 EnderSyth

@EnderSyth, this PR hasn't been merged yet. So you will need to try from my branch: https://github.com/timgreen/epub_to_audiobook/tree/local_tts

timgreen avatar May 01 '24 22:05 timgreen

> @EnderSyth, this PR hasn't been merged yet. So you will need to try from my branch: https://github.com/timgreen/epub_to_audiobook/tree/local_tts

That is the one I cloned:

git clone https://github.com/timgreen/epub_to_audiobook.git

It executes via 'epub_to_audiobook.py', which I believe is unique to your branch.

After following the normal build process, I used the code example from that repo:

`python3 epub_to_audiobook.py "path/to/book.epub" "path/to/output/folder" --tts local`

But modified to the following for my environment:

python .\epub_to_audiobook.py .\Saved_by_certain.epub .\Test\ --tts local

usage: epub_to_audiobook.py [-h] [--tts {azure,openai}] [--log LOG] [--preview] [--language LANGUAGE] [--newline_mode {single,double}] [--chapter_start CHAPTER_START] [--chapter_end CHAPTER_END] [--output_text] [--remove_endnotes] [--voice_name VOICE_NAME] [--break_duration BREAK_DURATION] [--output_format OUTPUT_FORMAT] [--openai_model OPENAI_MODEL] [--openai_voice OPENAI_VOICE] [--openai_format OPENAI_FORMAT] input_file output_folder

epub_to_audiobook.py: error: argument --tts: invalid choice: 'local' (choose from 'azure', 'openai')
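For readers unfamiliar with this kind of message: it is the standard text Python's argparse prints when a value is not in an option's `choices` list. The output above lists only `azure` and `openai`, which suggests the script being run does not come from the local_tts branch. Below is a minimal sketch of that behaviour. The helper names `build_parser` and `accepts_local` are hypothetical, and the second choices list (with `local` added) is only an assumption about how a branch might register the extra engine, not the PR's actual code.

```python
import argparse


def build_parser(tts_choices):
    """Build a tiny parser that mimics only the --tts option (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="epub_to_audiobook.py")
    parser.add_argument("--tts", choices=tts_choices, default=tts_choices[0])
    return parser


def accepts_local(tts_choices):
    """Return True if '--tts local' parses, False if argparse rejects it."""
    try:
        build_parser(tts_choices).parse_args(["--tts", "local"])
        return True
    except SystemExit:
        # On an invalid choice, argparse prints the usage line and the
        # "invalid choice" error to stderr, then exits with status 2.
        return False


if __name__ == "__main__":
    # Choices as shown in the error output above.
    print(accepts_local(["azure", "openai"]))
    # Assumed choices once a 'local' engine has been registered.
    print(accepts_local(["azure", "openai", "local"]))
```

If the first call reflects what you see, checking which branch is currently checked out is a reasonable first step before debugging anything else.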

EnderSyth avatar May 01 '24 23:05 EnderSyth

Timgreen, thanks for your work on this. Although it unfortunately hasn't been merged yet, I've been able to test this branch locally on Linux and it's working well for me so far. Local TTS is something I've been waiting on with this utility, and piper is a good choice in my book.

danielw97 avatar May 01 '24 23:05 danielw97

EnderSyth, make sure to check out the local_tts branch after cloning the repo.

danielw97 avatar May 01 '24 23:05 danielw97

> EnderSyth, make sure to check out the local_tts branch after cloning the repo.

Why, thank you. I'm new to this, so I missed that bit. After doing that I can indeed run it, though now it's giving me issues with piper not being a recognized command. I'm trying to figure out how to get that installed, but I can't get pip install piper-tts working because of a missing piper-phonemize dependency, which also can't be found. But at least I'm one step further, thank you.

EnderSyth avatar May 01 '24 23:05 EnderSyth

No problem at all, happy to help. Are you running Python 3.12 by any chance? I tried to build this on Ubuntu 24.04 and ran into a similar issue, although setting up a venv (a virtual environment with Python 3.11) fixed it. Hope this helps.
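As a rough illustration of that check, here is a small sketch that tests whether the running interpreter is Python 3.11. The function name `matches_thread_tested_python` is hypothetical, and the "3.11 only" rule is an assumption drawn solely from the reports in this thread, not a statement about which versions piper-tts officially supports.

```python
import sys


def matches_thread_tested_python(version_info=None):
    """Return True only for Python 3.11, the version reported to work in this thread.

    Assumption: this reflects the experience described here, not an
    official compatibility list for piper-tts.
    """
    if version_info is None:
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    return (major, minor) == (3, 11)


if __name__ == "__main__":
    current = "{}.{}".format(sys.version_info[0], sys.version_info[1])
    if matches_thread_tested_python():
        print("Python " + current + ": matches the version that worked in this thread.")
    else:
        print("Python " + current + ": consider creating a venv with Python 3.11.")
```

Running this inside the activated venv confirms which interpreter the environment actually uses, which can differ from the system-wide `python -V` output.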

danielw97 avatar May 01 '24 23:05 danielw97

I'm doing this under WSL on Ubuntu 20.04.6 LTS. I get Python 3.10.11 as the output running python -V.

While looking into the errors, I saw many posts about different versions of Python causing issues, but from what I read 3.10 was supposed to be fine.

EnderSyth avatar May 02 '24 00:05 EnderSyth

> No problem at all, happy to help. Are you running Python 3.12 by any chance? I tried to build this on Ubuntu 24.04 and ran into a similar issue, although setting up a venv (a virtual environment with Python 3.11) fixed it. Hope this helps.

Well, I bit the bullet, deleted the venv, and decided to redo it with Python 3.11, but under WSL I couldn't install Python 3.11 the usual way. Long story short, Bing GPT was able to guide me through compiling my own Python 3.11, and somehow it works!

Now I just have to figure out how to switch to libritts_r properly. I think I need the --voice_name flag, so I'm going to play with that next.

EnderSyth avatar May 02 '24 01:05 EnderSyth

So I didn't see this and just opened a pull request specifically for piper.

See #77

On the one hand, it's less generic; on the other hand, it "maps" parameters like silence and speed directly onto piper parameters.

Any suggestions welcome.

vcalv avatar Jul 24 '24 02:07 vcalv

This PR is out of date and appears to have been implemented before the major refactoring and restructuring of the project. I will close it soon if it is not updated.

Bryksin avatar Aug 24 '24 22:08 Bryksin

I think this local_tts feature is quite flexible. I might adapt it to the latest code when I'm available.

p0n1 avatar Sep 05 '24 06:09 p0n1