mimic1

Spanish Language

Open ryanleesipes opened this issue 9 years ago • 17 comments

Need to develop the ability to support the Spanish language.

ryanleesipes avatar Feb 10 '16 15:02 ryanleesipes

Hi, I'm also interested in this. Have you found anything recently? I am a software developer and I want to contribute to the project.

javiercani avatar Mar 12 '16 20:03 javiercani

@javiercani any help is appreciated. If you feel up to the challenge, adding more languages is a good research area. There are some resources around the web for getting started (such as http://homepages.inf.ed.ac.uk/jyamagis/software/page54/page54.html).

There is currently work under way to update the scripts for building pronunciation lexicons (pull request #17). Once they are fully working, they might be something to build upon for more languages.

forslund avatar Mar 12 '16 21:03 forslund

OK @forslund, I will read more and come back later. Thanks.

javiercani avatar Mar 12 '16 21:03 javiercani

European Spanish and Latin American Spanish are different; supporting both would be great. Keep up the good work!

AnderRasoVazquez avatar Jul 17 '16 12:07 AnderRasoVazquez

I've just installed mimic and played with the English voices. Has there been any progress on Spanish, or is there any guidance on how to help improve it?

adocampo avatar Oct 10 '16 23:10 adocampo

@zeehio is working on Spanish support and has made significant progress on the architecture. Last I heard, he was looking into phonetic dictionaries.

@zeehio, are there any suitable tasks for contributors that can be split from the main task?

forslund avatar Oct 14 '16 15:10 forslund

@malevolent, @zeehio outlined some improvements in #86 that might be good to work towards once that PR is merged.

forslund avatar Oct 18 '16 14:10 forslund

Perfect! If there is something I can do, I would like to contribute... if there is guidance it surely will help

adocampo avatar Oct 18 '16 15:10 adocampo

@malevolent, @zeehio has begun the wiki entry on the issue. Other than that, there is only the old flite documentation. I try to keep an eye on the mimic channel on the Mycroft Slack, so if you've got questions I can try to answer them.

forslund avatar Oct 18 '16 18:10 forslund

Well, I'll wait for the wiki to help. I'll keep an eye on this thread.

adocampo avatar Oct 18 '16 19:10 adocampo

Hello, question: this thread has been open since March '16. Is there any update on supporting the Spanish language? It does not seem to be ready yet: there is no documentation, and a couple of issues are still open here.

jenavarro avatar Jan 21 '17 15:01 jenavarro

The thread was opened right after work started on mimic, and work is slowly moving forward. @zeehio has done a couple of large chunks of work: he updated the model-builder script, added UTF-8 support, and has a pending PR for a Spanish tokenizer. In addition to this, he's working on a phonetic dictionary. Unfortunately he seems to have less time for mimic work these days, and I totally understand him.

Any help is appreciated; there are pending PRs that no one has had the opportunity to review. Directly related to this issue is PR #86. I think there's a possible memory leak in it, but I might be wrong, and an extra set of eyes (and a brain sharper than mine behind them) to confirm would be good.

forslund avatar Jan 21 '17 16:01 forslund

Hi,

Unfortunately I am trapped under a lot of PhD-related work. I don't think I will be able to commit time to mimic in the coming months, but I may be able to assist if someone else wants to do the work.

I have made the fixes suggested by @forslund to the last PR I submitted some months ago (hi, and thanks for your understanding; I wish I could work more on this). Once that is merged, someone with Spanish knowledge could work on improving the token-to-words rules. See the code in #86 and write here if you are still interested (@javiercani @adocampo).

Converting words (that come out of the tokenizer) to phonemes can be done through the saga library. This is more memory efficient than building a whole lexicon for each Spanish dialect and covers several dialects.
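
To illustrate why rules can stand in for a per-dialect lexicon, here is a toy letter-to-sound sketch in Python. It is not saga's API or rule set: the function name `to_phonemes`, the `dialect` argument, and the ASCII phoneme symbols are all made up for this example, and it ignores stress and many real Spanish rules. The only dialect difference it models is whether "c" (before e/i) and "z" are pronounced "th" or "s".

```python
# Toy rule-based letter-to-sound conversion for Spanish.
# Hypothetical rules and symbols for illustration only; not saga's.

FRONT_VOWELS = {"e", "i", "é", "í"}

DIGRAPHS = {"ch": "tS", "ll": "jj", "rr": "rr", "qu": "k"}


def to_phonemes(word, dialect="castilian"):
    """Return a list of phoneme symbols for a single Spanish word."""
    # Castilian keeps "th" and "s" apart; most Latin American varieties
    # merge them into "s". One switch covers both dialect groups.
    theta = "th" if dialect == "castilian" else "s"
    singles = {
        "j": "x", "ñ": "ny", "v": "b", "y": "jj", "z": theta,
        # Accent marks only affect stress, which this sketch ignores.
        "á": "a", "é": "e", "í": "i", "ó": "o", "ú": "u",
    }
    word = word.lower()
    phones = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        nxt = word[i + 1:i + 2]  # empty string at the end of the word
        if pair in DIGRAPHS:
            phones.append(DIGRAPHS[pair])
            i += 2
        elif pair == "gu" and word[i + 2:i + 3] in FRONT_VOWELS:
            # The "u" is silent in "gue" / "gui".
            phones.append("g")
            i += 2
        elif word[i] == "c":
            phones.append(theta if nxt in FRONT_VOWELS else "k")
            i += 1
        elif word[i] == "g":
            phones.append("x" if nxt in FRONT_VOWELS else "g")
            i += 1
        elif word[i] == "h":
            i += 1  # "h" is silent
        else:
            phones.append(singles.get(word[i], word[i]))
            i += 1
    return phones


print(to_phonemes("cielo"))                   # Castilian reading
print(to_phonemes("cielo", dialect="latam"))  # merged "s" reading
```

The whole rule table is a few dozen entries regardless of vocabulary size, whereas a lexicon needs one entry per word per dialect. A real system would still need exception handling for loanwords and proper stress assignment.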

More pieces are needed but our only choice is to work little by little (as our time allows us) on each of them.

zeehio avatar Feb 20 '17 11:02 zeehio

Hello @zeehio

I would like to help with the code too. I will have a few months free, so I think I can do something.

albertosgz avatar Feb 21 '17 02:02 albertosgz

Spanish HTS voice for Festival: http://homepages.inf.ed.ac.uk/jyamagis/software/page54/page54.html

zeehio avatar Mar 27 '17 18:03 zeehio

Mycroft may, but mimic still hasn't got it (except for my own very poor attempt and zeehio's top secret state of the art technology)

forslund avatar May 15 '17 18:05 forslund

Since my PR would use a GPL library for the phonetic transcription, merging it would require changing mimic's license to GPL. While some cleaning may be needed in the code, this is the major blocking issue for Spanish support right now. Pinging @penrods to find out how things stand on that end.

Oh, and the Spanish voice I have was the state of the art some years ago... It has room for improvement... but it's a start! :-)

zeehio avatar May 15 '17 19:05 zeehio