Greek language output is incorrect / sounds mixed with other languages

Open 4ever-AI opened this issue 3 months ago • 7 comments

Hello Resemble team 👋

First of all, thank you for releasing Chatterbox Multilingual.

However, I noticed a serious issue with Greek (el) support:

The generated audio does not sound like proper Greek.

Pronunciation is inconsistent and often mixed with English or other languages.

Even simple words (e.g. "Καλημέρα", "Ευχαριστώ") are not pronounced correctly.

Using a Greek reference audio sample does not fix the issue.

It seems that Greek support might not have been trained sufficiently, or the phoneme mapping is incorrect.

I wish you all the best with the development.

4ever-AI avatar Sep 07 '25 12:09 4ever-AI

It seems there is a problem with accented Greek letters such as ά, έ, ή, ί, ό, and ώ. On the demo page, however, the provided example does not seem to have this issue https://resemble-ai.github.io/chatterbox_demopage/

ppc2017 avatar Sep 07 '25 21:09 ppc2017

Thank you for your reply and for taking the time to check. I checked the demo page and can confirm that the sample there is correct. However, the Space demo does not work correctly: the output sounds like a mix of many languages instead of just Greek.

4ever-AI avatar Sep 07 '25 22:09 4ever-AI

As a temporary workaround, you can remove the accents from the vowels. In this case, it will speak correctly, but obviously some words will be stressed incorrectly. Another issue is that it does not recognize the Greek question mark ( ; ), although it works fine with the English one.
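
For anyone who wants to script this workaround instead of editing text by hand, here is a minimal sketch using only the Python standard library. The function names `strip_tonos` and `normalize_question_mark` are hypothetical, and two choices are my own assumptions: only the acute accent (tonos, U+0301) is removed so the dialytika on ϊ/ϋ is kept, and every semicolon is treated as a Greek question mark, which is only safe for purely Greek text.

```python
import unicodedata

COMBINING_ACUTE = "\u0301"       # the tonos once a letter is decomposed
GREEK_QUESTION_MARK = "\u037e"   # looks like ';' but is a separate code point


def strip_tonos(text: str) -> str:
    """Remove the tonos from Greek vowels, keeping any dialytika."""
    decomposed = unicodedata.normalize("NFD", text)
    without_accent = decomposed.replace(COMBINING_ACUTE, "")
    # Recompose so that letters such as iota + dialytika become one character again.
    return unicodedata.normalize("NFC", without_accent)


def normalize_question_mark(text: str) -> str:
    """Replace the Greek question mark (and ASCII ';') with '?'.

    Assumption: the input is Greek text, where ';' marks a question.
    """
    return text.replace(GREEK_QUESTION_MARK, "?").replace(";", "?")


if __name__ == "__main__":
    print(normalize_question_mark(strip_tonos("Καλημ\u03adρα, τι κ\u03acνεις;")))
```

As noted above, the stress information is lost, so some words may still be stressed incorrectly.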

ppc2017 avatar Sep 07 '25 22:09 ppc2017

Thank you very much for the information. After trying it, the result was indeed a bit better. However, almost all Greek words need their accents to be understood. Thanks again for sharing the tip. I hope they update the project.

4ever-AI avatar Sep 07 '25 23:09 4ever-AI

The workaround from the Polish thread can also be used with Greek. Here’s the code adapted for Greek. Save it as a text file with the .py extension (e.g., greek_decomposer.py), then run it with Python. When you run it, a GUI opens where you can type or paste Greek text. Press the DECOMPOSE button to decompose it. The script also places the result on the clipboard, so you can paste it straight into Chatterbox, or copy it from the output box with the Ctrl+C shortcut.

import tkinter as tk
from tkinter import scrolledtext
from tkinter import Menu

# Mapping of Greek accented and diacritic characters to decomposed forms
GREEK_DECOMPOSE_MAP = {
    # Modern Greek (monotonic) with tonos
    "ά": "α\u0301",  # U+03AC → U+03B1 (alpha) + U+0301 (combining acute)
    "έ": "ε\u0301",  # U+03AD → U+03B5 (epsilon) + U+0301
    "ή": "η\u0301",  # U+03AE → U+03B7 (eta) + U+0301
    "ί": "ι\u0301",  # U+03AF → U+03B9 (iota) + U+0301
    "ό": "ο\u0301",  # U+03CC → U+03BF (omicron) + U+0301
    "ύ": "υ\u0301",  # U+03CD → U+03C5 (upsilon) + U+0301
    "ώ": "ω\u0301",  # U+03CE → U+03C9 (omega) + U+0301
    "Ά": "Α\u0301",  # U+0386 → U+0391 (capital alpha) + U+0301
    "Έ": "Ε\u0301",  # U+0388 → U+0395 (capital epsilon) + U+0301
    "Ή": "Η\u0301",  # U+0389 → U+0397 (capital eta) + U+0301
    "Ί": "Ι\u0301",  # U+038A → U+0399 (capital iota) + U+0301
    "Ό": "Ο\u0301",  # U+038C → U+039F (capital omicron) + U+0301
    "Ύ": "Υ\u0301",  # U+038E → U+03A5 (capital upsilon) + U+0301
    "Ώ": "Ω\u0301",  # U+038F → U+03A9 (capital omega) + U+0301
    # Modern Greek with dialytika
    "ϊ": "ι\u0308",  # U+03CA → U+03B9 (iota) + U+0308 (combining dialytika)
    "ϋ": "υ\u0308",  # U+03CB → U+03C5 (upsilon) + U+0308
    "Ϊ": "Ι\u0308",  # U+03AA → U+0399 (capital iota) + U+0308
    "Ϋ": "Υ\u0308",  # U+03AB → U+03A5 (capital upsilon) + U+0308
    # Modern Greek with dialytika and tonos
    "ΐ": "ι\u0308\u0301",  # U+0390 → U+03B9 (iota) + U+0308 + U+0301
    "ΰ": "υ\u0308\u0301",  # U+03B0 → U+03C5 (upsilon) + U+0308 + U+0301
    # Polytonic Greek (optional, for completeness)
    "ἀ": "α\u0313",  # U+1F00 → U+03B1 (alpha) + U+0313 (combining smooth breathing)
    "ἁ": "α\u0314",  # U+1F01 → U+03B1 + U+0314 (combining rough breathing)
    "ἂ": "α\u0313\u0300",  # U+1F02 → U+03B1 + U+0313 + U+0300 (combining grave)
    "ἃ": "α\u0314\u0300",  # U+1F03 → U+03B1 + U+0314 + U+0300
    # Add other polytonic forms as needed (e.g., for η, ι, υ, etc.)
}

def decompose_greek_text(text):
    """Replace Greek accented and diacritic characters with decomposed forms."""
    return "".join(GREEK_DECOMPOSE_MAP.get(char, char) for char in text)

def on_decompose():
    """Handle the decompose button click."""
    input_text = input_box.get("1.0", tk.END).rstrip()
    decomposed_text = decompose_greek_text(input_text)
    output_box.delete("1.0", tk.END)
    output_box.insert(tk.END, decomposed_text)
    root.clipboard_clear()
    root.clipboard_append(decomposed_text)

def paste_text(event=None):
    """Paste text from clipboard into the input box."""
    try:
        input_box.delete("1.0", tk.END)
        input_box.insert(tk.END, root.clipboard_get())
    except tk.TclError:
        pass  # Handle empty or invalid clipboard
    return "break"  # Prevent default binding

def show_context_menu(event):
    """Show right-click context menu for paste."""
    context_menu.post(event.x_root, event.y_root)

# Setup GUI
root = tk.Tk()
root.title("Greek Decomposer for Chatterbox TTS")

# Input label and text box
tk.Label(root, text="Input Greek text:").pack(pady=(10, 0))
input_box = scrolledtext.ScrolledText(root, wrap=tk.WORD, width=60, height=10)
input_box.pack(padx=10, pady=5)

# Bind paste shortcut (Ctrl+V or Cmd+V)
input_box.bind("<Control-v>", paste_text)  # For Windows/Linux
input_box.bind("<Command-v>", paste_text)  # For macOS

# Create context menu for right-click
context_menu = Menu(root, tearoff=0)
context_menu.add_command(label="Paste", command=paste_text)
input_box.bind("<Button-3>", show_context_menu)  # Right-click for context menu

# Decompose button
decompose_button = tk.Button(root, text="DECOMPOSE", command=on_decompose)
decompose_button.pack(pady=5)

# Output label and text box
tk.Label(root, text="Decomposed output:").pack(pady=(10, 0))
output_box = scrolledtext.ScrolledText(root, wrap=tk.WORD, width=60, height=10)
output_box.pack(padx=10, pady=5)

root.mainloop()
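
A possible shortcut, assuming the goal is plain canonical decomposition: Python's standard `unicodedata.normalize("NFD", ...)` produces the same sequences as the table above and also covers the polytonic forms the table leaves out. The name `decompose_nfd` is hypothetical. One difference to be aware of is that NFD is not limited to Greek: it also decomposes accented letters from other scripts and turns the Greek question mark (U+037E) into an ASCII semicolon.

```python
import unicodedata


def decompose_nfd(text: str) -> str:
    """Return text in Unicode canonical decomposition (NFD).

    Precomposed Greek letters become base letter + combining marks,
    e.g. U+03AC (alpha with tonos) -> U+03B1 + U+0301.
    """
    return unicodedata.normalize("NFD", text)


if __name__ == "__main__":
    # Show the code points produced for a single accented alpha.
    print([hex(ord(ch)) for ch in decompose_nfd("\u03ac")])  # ['0x3b1', '0x301']
```

This could be dropped into the script in place of `decompose_greek_text` without changing the GUI code.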

ppc2017 avatar Sep 08 '25 00:09 ppc2017

That worked! Thank you very much! I can confirm that everything now sounds right. Comparing it with VibeVoice, I think Chatterbox loses some of the speaker's original voice quality, which VibeVoice preserves much better, although Chatterbox is faster.

Thank you so much once more!

4ever-AI avatar Sep 08 '25 11:09 4ever-AI

I can confirm, as a native speaker, that @ppc2017's decompose solution improves Greek audio quality by a lot.

JKolios avatar Oct 02 '25 15:10 JKolios