
Ability to listen to audiobooks

Open DraganRatkovich opened this issue 2 years ago • 16 comments

Is your feature request related to a problem? Please describe.

No

Describe the solution you'd like

I wonder what the community thinks about adding the ability to listen to audiobooks in this program. I know that you can listen to an audiobook in any media player, but listening to an audiobook with the ability to save bookmarks to quickly jump to the right section, make notes, etc., would probably be amazing. Opinions of others are very welcome.

Describe alternatives you've considered

Additional context

It would be great to hear from the developers @mush42 @MichelSuch @cary-rowen @pauliyobo about this feature.

DraganRatkovich avatar Feb 13 '22 08:02 DraganRatkovich

Hello @DraganRatkovich

The first question that pops into mind is whether there is any standard format for audio books. If so, we can contemplate adding support for it the same way we have support for other document types.

Technically speaking, it is very easy to support playing audio in Bookworm. But this is out of scope for an eBook reader.

Another related feature is support for EPUB 3 media overlays. But this is hampered by the lack of adoption of this feature by publishers.

Best

Musharraf

mush42 avatar Feb 13 '22 13:02 mush42

Good day @mush42. No, there are no special standards for audiobooks, unless the bookseller wants to encrypt them and convert them to another format, such as .lkf. Currently, the most popular audiobook formats are standard media files, .mp3 and .wav, and to my knowledge Python can easily open these formats.

The biggest reason for requesting this feature is Bookworm's amazing abilities, like opening a wide range of formats such as .pdf, .doc and many more. I was really shocked when Bookworm opened a 1000-page document in a matter of seconds. I am still not sure what you guys did there, but you are amazing, and Adobe Acrobat has been removed from my system for this reason 🙂. My shock only increased when I used your new OCR on a Georgian-language document made of images combined into a .pdf, and Bookworm rendered the pages amazingly. This program is the first PC software that can read scanned Georgian PDF files in an accessible way, so thanks for that.

Now, in terms of the audiobook format: if this feature is added, Bookworm will become the number one program in the blind community, at least in Georgia. It could be imagined as a library on a computer that helps people read, scan, or listen to audiobooks in a convenient way.

DraganRatkovich avatar Feb 13 '22 14:02 DraganRatkovich

@mush42, we can support them the same way Voice Dream Reader does, I think. Just take in either an MP3/wav file, or a zip file of audio files that get turned into their own chapters.
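As a rough sketch of the zip idea, assuming each audio file in the archive becomes one chapter and that chapters are ordered by file name: the names `Chapter`, `natural_key`, and `chapters_from_zip` are hypothetical and are not part of Bookworm or Voice Dream Reader.

```python
import re
import zipfile
from dataclasses import dataclass
from pathlib import PurePosixPath

# Assumption: only the two formats named in this thread count as audio.
AUDIO_EXTENSIONS = {".mp3", ".wav"}


@dataclass
class Chapter:
    title: str         # derived from the file name, not from any metadata
    archive_path: str  # path of the audio file inside the zip


def natural_key(name):
    """Sort key that compares digit runs as numbers, so '2' comes before '10'."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def chapters_from_zip(zip_path):
    """Treat every audio file in the archive as one chapter, in natural order."""
    with zipfile.ZipFile(zip_path) as archive:
        names = [n for n in archive.namelist()
                 if PurePosixPath(n).suffix.lower() in AUDIO_EXTENSIONS]
    names.sort(key=natural_key)
    return [Chapter(title=PurePosixPath(n).stem, archive_path=n) for n in names]
```

Non-audio entries such as cover images are simply skipped. A single MP3/WAV file would be the degenerate case of a one-chapter book.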

TheQuinbox avatar Feb 16 '22 16:02 TheQuinbox

My only concern is how we would be able to keep this consistent. If we allowed any MP3/WAV file to be opened within Bookworm, it would basically become a media player. Is that what @DraganRatkovich is essentially proposing? Another problem is that I am not sure how we would be able to easily represent the book. Using zip files could be an idea, but what if the entire book is in a single file? How do we make sure that what is being opened is an audiobook and not, say, a simple audio clip? If there's anything I should be aware of that I have missed, I would be more than happy to receive pointers to it.

pauliyobo avatar Feb 16 '22 16:02 pauliyobo

@TheQuinbox Should we enable the same tools we have for textual documents, i.e. bookmarking, note-taking, etc.?

If yes, what would that look like?

mush42 avatar Feb 16 '22 16:02 mush42

@mush42, I'm not sure exactly how the bookmarking code works, but I'd assume that it stores an index in the text? We could store the position that the user was at in the audio file.
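A minimal sketch of what such a bookmark could look like, assuming a bookmark is just a file name plus a playback offset in seconds, stored as JSON. The `AudioBookmark` class and both helper functions are hypothetical illustrations, not Bookworm's actual bookmark model.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class AudioBookmark:
    chapter_file: str        # which audio file (chapter) the bookmark belongs to
    position_seconds: float  # playback offset within that file
    title: str = ""          # optional label or note entered by the user


def save_bookmarks(bookmarks, path):
    """Write the bookmarks to a JSON file as a list of plain dictionaries."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump([asdict(b) for b in bookmarks], f, indent=2)


def load_bookmarks(path):
    """Read bookmarks back from a file written by save_bookmarks."""
    with open(path, encoding="utf-8") as f:
        return [AudioBookmark(**item) for item in json.load(f)]
```

Jumping to a bookmark would then mean opening `chapter_file` and seeking to `position_seconds`, much as a text bookmark jumps to a stored index.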

TheQuinbox avatar Feb 16 '22 20:02 TheQuinbox

@TheQuinbox Yes, that's exactly how it is done.

Still, the issues raised by @pauliyobo need to be resolved before we embark on this.

Best

mush42 avatar Feb 16 '22 22:02 mush42

Good day to all. Sorry for the late reply. I have seen all the comments left on this feature. To be honest, I don't know how Bookworm could be limited to playing only audiobooks and not other media files because, as @pauliyobo said, music files could just as easily be opened with Bookworm. I am not sure whether there is any way to prevent this. On the other hand, there are dedicated programs for listening to audiobooks that do not interfere with the playback of other media files. Simply put, it should be up to the user: whether the file is an audiobook or music, the user decides to open it as an audiobook. If you have any ideas or know other ways, I'm very happy to discuss, as I'm also very interested.

DraganRatkovich avatar Feb 19 '22 12:02 DraganRatkovich

What about converting the audiobook to a text book first and then processing it as usual?

This isn't as simple as just sending audio to a speech-to-text engine, but I think it would be the most straightforward way to accomplish the given task using all the features readily available now.

This is theoretically similar to performing OCR on an image in order to turn it into text, but in practice it would depend on how the audiobook defines its sections, how it reads its titles, etc. Some of them have little soundtracks to mark the end of sections, others use the narrator's voice, etc.

iuriguilherme avatar Apr 02 '22 01:04 iuriguilherme

Hello @iuriguilherme

Could you give more technical details on what exactly do you mean by converting audiobook to text book format? I can understand the DAISY format, which is basically .mp3 files with extra HTML and other data to handle sections, chapters, etc. properly, but I can't understand how it is possible to convert an mp3 file recorded in a studio into a text file. I know of several encryption methods for preserving the digital rights of audiobooks, but they are completely different things.

DraganRatkovich avatar Apr 02 '22 06:04 DraganRatkovich

Could you give more technical details on what exactly do you mean by converting audiobook to text book format?

Through the use of a speech-to-text engine (STT).

EDIT: an example using google stt: https://github.com/googleapis/python-speech/blob/main/samples/snippets/quickstart.py
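To make the idea concrete, here is a minimal sketch of the surrounding logic only, assuming the audiobook is an uncompressed WAV file. It splits the audio into fixed-length chunks and records where each chunk starts, so the resulting text can still be mapped back to a position in the recording. The `transcribe` argument is a hypothetical placeholder for whichever STT engine is used; no real engine's API is shown here.

```python
import wave


def transcribe_in_chunks(wav_path, transcribe, chunk_seconds=30.0):
    """Return (start_seconds, text) pairs for consecutive chunks of a WAV file.

    `transcribe` is a caller-supplied function (hypothetical here) that takes
    raw PCM bytes plus the sample rate and returns the recognised text.
    """
    segments = []
    with wave.open(str(wav_path), "rb") as wav:
        rate = wav.getframerate()
        frames_per_chunk = int(rate * chunk_seconds)
        start_frame = 0
        while True:
            pcm = wav.readframes(frames_per_chunk)
            if not pcm:  # end of file reached
                break
            segments.append((start_frame / rate, transcribe(pcm, rate)))
            start_frame += frames_per_chunk
    return segments
```

Cutting at fixed intervals can split a word in half, so a real implementation would probably cut on silence instead. MP3 input would also need decoding first, which the standard library does not provide.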

iuriguilherme avatar Apr 02 '22 07:04 iuriguilherme

@iuriguilherme Then what's the point of integrating the audiobook listening feature into Bookworm if the audiobook needs to be converted into a text book?

Or do you mean text-to-speech technology rather than STT? Do you have examples of the types of books mentioned?

DraganRatkovich avatar Apr 02 '22 07:04 DraganRatkovich

No, I mean exactly converting speech to text, because that is what allows all the processing BEFORE you use text-to-speech (TTS) to interact with the user.

I use this approach with hearing/speaking robots. In fact, every talking robot you see out there is essentially a chatbot which converts what it hears to text, processes it using neural models, and then converts the text reply to speech so the robot can answer.

iuriguilherme avatar Apr 02 '22 07:04 iuriguilherme

@iuriguilherme Sounds really interesting. Perhaps the lead developer might consider this as I don't really have any knowledge of programming or other development.

DraganRatkovich avatar Apr 02 '22 07:04 DraganRatkovich

I don't have good knowledge of how to programmatically use Cortana (the Windows built-in STT engine); I only know the cloud API services. But the logic is what I described.

iuriguilherme avatar Apr 02 '22 07:04 iuriguilherme

In terms of a standard audiobook format, M4B comes to mind. It is at least something that I would like to see supported if this feature gets added, given that the format has chapter tags built in that most modern media players support.
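One way those chapter tags could be read, sketched under the assumption that the `ffprobe` tool from FFmpeg is installed and on the PATH. This is an illustration only, not something Bookworm does today, and the function names `parse_chapters` and `read_chapters` are hypothetical.

```python
import json
import subprocess


def parse_chapters(ffprobe_json):
    """Turn ffprobe's JSON chapter listing into (start_seconds, title) pairs."""
    data = json.loads(ffprobe_json)
    chapters = []
    for index, chapter in enumerate(data.get("chapters", []), start=1):
        # Fall back to a generic title when a chapter carries no title tag.
        title = chapter.get("tags", {}).get("title", f"Chapter {index}")
        chapters.append((float(chapter["start_time"]), title))
    return chapters


def read_chapters(path):
    """Ask ffprobe (assumed to be installed) for the file's chapter markers."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-print_format", "json",
         "-show_chapters", str(path)],
        capture_output=True, text=True, check=True,
    )
    return parse_chapters(result.stdout)
```

Each `(start_seconds, title)` pair could then serve as a table-of-contents entry, so a single-file audiobook would still be navigable by chapter.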

Lucas18503 avatar Apr 16 '22 02:04 Lucas18503