Missing phonemes when converting from MusicXML to DS
Hi, what is the best way to convert a MusicXML file to a Diffsinger script? When I try this, I get the error: ParamsError: The source file lacks phoneme parameters. I basically understand what this means: MusicXML doesn't provide phonemes natively. So how do I do that, then? I thought opencpop-extension would specify those so they could be used. Thank you for any hint you have on this.
The source file needs to store a list of phonemes with their exact durations in the note attribute. Currently only a few project file formats have implemented this feature, such as acep, svip, tlp, etc.
Thank you for the explanation. However, I'm still not sure how to solve the problems I currently have; maybe you have an idea. My workflow right now is converting MusicXML to USTX and then using Diffsinger through OpenUTAU. However, OpenUtau separates syllables a bit strangely, so whole words are not processed, leading to wrong pronunciations in many cases. Plus, connected notes don't sing as they should; instead, the lyric la is always inserted. I had hoped to solve these by converting directly to Diffsinger, which should be able to process whole words and connected notes. But this doesn't seem to work: even if I convert to an intermediate format that supports those parameters, like SVIP or USTX, I can't continue directly to DS without first running the software so that the phonemizer adds the flags. Do you have any idea what I could try to improve that? Thank you very much in advance for any advice, and for all your effort. I'll make a PR as soon as I have the German translation ready.
MusicXML represents printed sheet music for a human reader, while USTX represents input to a machine singing synthesis program. That's why they handle lyrics differently. We should have smarter lyric conversion logic.
For multisyllable lyrics: in OpenUtau, we have to input the whole word on the first note, use + on the following notes to distribute the syllables, and use +~ to extend the current syllable, which is different from printed sheet music.
For "connected notes" (in OpenUtau we call them slur notes), they are represented in sheet music as two notes connected with a slur symbol:
In OpenUtau, input lyrics on the first note, and input +~ on the second note.
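A minimal Python sketch of what such a lyric conversion could look like, following the + and +~ conventions described above. It assumes each note has already been read from the MusicXML file as a (text, syllabic) pair, where syllabic is one of the MusicXML values single, begin, middle, or end, and text is None for a note that has no lyric of its own (for example the second note under a slur). The function name to_openutau_lyrics is hypothetical and is not part of LibreSVIP or OpenUtau; rests and multiple verses are not handled.

```python
from typing import List, Optional, Tuple

# (text, syllabic): text is None for a note without its own lyric;
# syllabic is "single", "begin", "middle", "end", or None.
Note = Tuple[Optional[str], Optional[str]]


def to_openutau_lyrics(notes: List[Note]) -> List[str]:
    """Hypothetical helper: turn sheet-music syllables into OpenUtau-style lyrics."""
    out: List[str] = []
    word_start: Optional[int] = None  # index in `out` of the note holding the current word

    for text, syllabic in notes:
        if text is None:
            # No lyric of its own: extend the syllable that is currently sounding.
            out.append("+~")
        elif syllabic == "begin":
            # First syllable of a multi-syllable word; later syllables are joined onto it.
            word_start = len(out)
            out.append(text)
        elif syllabic in ("middle", "end") and word_start is not None:
            # Append this syllable to the word on its first note,
            # and mark this note as "next syllable".
            out[word_start] += text
            out.append("+")
            if syllabic == "end":
                word_start = None
        else:
            # "single", or a middle/end syllable with no preceding "begin".
            word_start = None
            out.append(text)
    return out


if __name__ == "__main__":
    score = [("hap", "begin"), (None, None), ("py", "end"), ("day", "single"), (None, None)]
    print(to_openutau_lyrics(score))
```

For the sample input this yields happy, +~, +, day, +~. Whether a given phonemizer splits the joined word at the same syllable boundaries as the score is not guaranteed, so the result would still need checking by ear.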
@oxygen-dioxide Thank you for pointing that out. Maybe I should give some more context on why I created this issue: I'm blind and use a screenreader, so the images you sent don't really help me. I'm familiar with sheet music and work with Musescore a lot, but I'm new to SVS. Since OpenUTAU isn't really accessible, I wouldn't use it if I didn't have to; for me it's nothing more than middleware between MusicXML and Diffsinger. Because of this, I was hoping to be able to convert MusicXML directly to DS, which is why I asked. But as SoulMelody wrote, this is not possible because of the missing parameters. Since I don't know whether Diffscope is finished yet, and LibreSVIP doesn't support the dsp format, it looks like I still need to stick with OpenUTAU for my Diffsinger creations. I have very limited access to it: selecting a singer and phonemizer works, but I can forget about the editor. That's why your hint about +~ for slur notes was already very valuable. LibreSVIP doesn't insert it; instead it inserts la. However, I tried it, and now I run this after each command:
sed -i 's/\blyric: la\b/lyric: +~/g' file.ustx
That fixes the slurs. However, since this was not very intuitive, maybe LibreSVIP could do that automatically in the future. As for words with more than one syllable: knowing that + is used is already helpful, but going through the whole file and inserting it everywhere inside the words would take a lot of time, since it is a manual process. If the lyric handling could get smarter, it could save a lot of time. Thank you again, +~ was already very helpful, and maybe we can simplify this process further in the future. Have a merry Christmas!
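For systems where sed -i behaves differently or is not available, the same replacement can be sketched in Python. This assumes, as the sed command does, that lyrics appear in the USTX text as "lyric: la"; the function name replace_la_with_slur is hypothetical. Like the sed command, it also replaces any la that was intended as a real lyric.

```python
import re

# Same pattern as the sed command: "lyric: la" as a whole word,
# so "lyric: lala" or "lyric: lay" are left untouched.
_LA_LYRIC = re.compile(r"\blyric: la\b")


def replace_la_with_slur(ustx_text: str) -> str:
    """Hypothetical helper: replace every 'lyric: la' with 'lyric: +~'."""
    return _LA_LYRIC.sub("lyric: +~", ustx_text)


if __name__ == "__main__":
    sample = "  lyric: Hal\n  lyric: la\n  lyric: lala\n"
    print(replace_la_with_slur(sample))
```

To apply it to a file, read the text with pathlib.Path("file.ustx").read_text(encoding="utf-8"), pass it through the function, and write the result back with write_text.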