[Feature Request]: Romanized Lyrics using Google Translate API

Open AMZMA opened this issue 7 months ago • 6 comments

Preflight Checklist

  • [x] I use the latest version of YouTube Music (Application).
  • [x] I have searched the issue tracker for a feature request that matches the one I want to file, without success.

Problem Description

Use Google Translate to romanize (transliterate / show the pronunciation of) lyrics in any language that is not written in the Latin script.

Proposed Solution

for reference: https://github.com/ssut/py-googletrans https://github.com/therealbush/translator

Alternatives Considered

thank you.

Additional Information

No response

AMZMA avatar May 06 '25 14:05 AMZMA

py-googletrans looks promising

I've already tried various Google translate endpoints in the past w/o success, will give this a look later

ArjixWasTaken avatar May 06 '25 14:05 ArjixWasTaken

I've been doing this manually using Google Translate with MacroDroid on my phone. I just copy all the lyrics (from LrcLib or Kugou) along with their timecodes, then paste them into Google Translate via browser. I replace the newline and space characters beforehand using regex.

I'm not sure if this is helpful, though.
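
For an automated version of this, one option is to split the timecodes off before sending anything to a translation service, then stitch them back on afterwards. This is not the regex replacement described above but a different approach. The sketch assumes LRC-style tags such as `[01:23.45]` at the start of each line; `split_lrc` and `rejoin` are hypothetical helper names.

```python
import re
from typing import List, Optional, Tuple

# Assumption: each timed line starts with one LRC tag like [01:23.45] or [01:23].
TIMECODE = re.compile(r"^\s*(\[\d{1,2}:\d{2}(?:[.:]\d{1,3})?\])\s*(.*)$")

Row = Tuple[Optional[str], str]


def split_lrc(lyrics: str) -> List[Row]:
    """Split lyrics into (timecode, text) rows, dropping blank lines."""
    rows: List[Row] = []
    for line in lyrics.splitlines():
        match = TIMECODE.match(line)
        if match:
            rows.append((match.group(1), match.group(2).strip()))
        elif line.strip():
            # Untimed line: keep the text, with no timecode attached.
            rows.append((None, line.strip()))
    return rows


def rejoin(rows: List[Row], romanized: List[str]) -> str:
    """Reattach the original timecodes to the romanized text, row by row."""
    if len(rows) != len(romanized):
        raise ValueError("expected one romanized line per lyric row")
    return "\n".join(f"{tag or ''}{text}" for (tag, _), text in zip(rows, romanized))
```

Only the text half of each row would be sent for romanization, so the timecodes cannot be mangled along the way.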

AMZMA avatar May 06 '25 15:05 AMZMA

py-googletrans somewhat works: translation.pronunciation is always None, but the romaji does show up in translation.extra_data.

input: 獣は砂を一握り撒いた
output:

{
  "translation": [
    [
      "The beast spread a handful of sand",
      "獣は砂を一握り撒いた",
      null,
      null,
      3,
      null,
      null,
      [
        [
          null,
          "offline"
        ]
      ],
      [
        [
          [
            "b8e89a82d2c70a0f411a34d09528c227",
            "offline_launch_doc.md"
          ]
        ]
      ]
    ],
    [
      null,
      null,
      null,
      "Kemono wa suna o hitonigiri maita"
    ]
  ],
  "all-translations": null,
  "original-language": "ja",
  "possible-translations": [
    [
      "獣は砂を一握り撒いた",
      null,
      [
        [
          "The beast spread a handful of sand",
          null,
          true,
          false,
          [
            3
          ],
          null,
          [
            [
              3
            ]
          ]
        ]
      ],
      [
        [
          0,
          10
        ]
      ],
      "獣は砂を一握り撒いた",
      0,
      0
    ]
  ],
  "confidence": 1,
  "possible-mistakes": null,
  "language": [
    [
      "ja"
    ],
    null,
    [
      1
    ],
    [
      "ja"
    ]
  ],
  "synonyms": null,
  "definitions": null,
  "examples": null,
  "see-also": null
}

ArjixWasTaken avatar May 06 '25 19:05 ArjixWasTaken

What remains to be figured out is, how can we reliably get the romanization from that json? Is the json always consistent? Why is py-googletrans not detecting the pronunciation?
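
As a starting point, here is a minimal defensive sketch for pulling the romanization out of that dict. It assumes, based only on the single sample above, that the romanization is the string at index 3 of the last entry in `extra_data["translation"]`. Whether that position is stable across languages and inputs is exactly the open question, so the function returns None rather than guessing when the shape differs. `extract_romanization` is a hypothetical helper name, not part of py-googletrans.

```python
from typing import Any, Dict, Optional


def extract_romanization(extra_data: Dict[str, Any]) -> Optional[str]:
    """Return the source-text romanization from extra_data, or None if absent.

    Assumption (from one sample only): the romanization is the string at
    index 3 of the last entry in extra_data["translation"].
    """
    entries = extra_data.get("translation")
    if not isinstance(entries, list) or not entries:
        return None
    last = entries[-1]
    # Check the shape at every step so an unexpected layout yields None, not an exception.
    if isinstance(last, list) and len(last) > 3 and isinstance(last[3], str) and last[3]:
        return last[3]
    return None
```

It would be called as `extract_romanization(translation.extra_data)` on the result object described above.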

ArjixWasTaken avatar May 06 '25 19:05 ArjixWasTaken

https://github.com/fast4x/RiMusic/ has been doing this with no problems so far.

See https://github.com/fast4x/RiMusic/blob/master/composeApp/src/androidMain/kotlin/it/fast4x/rimusic/ui/screens/player/Lyrics.kt, starting at line 399.

AMZMA avatar May 07 '25 11:05 AMZMA

👀

  • sl=auto: source language
  • client: idk, but it's required
  • dt=rm: transliteration
  • dj=1: nice json key names

curl \
  -G 'https://translate.google.com/translate_a/single' \
  -d 'sl=auto' \
  -d 'client=gtx' \
  -d 'dt=rm' \
  -d 'dj=1' \
  --data-urlencode "q=獣は砂を一握り撒いた"

{
  "sentences": [
    {
      "src_translit": "Kemono wa suna o hitonigiri maita"
    }
  ],
  "src": "ja",
  "confidence": 1.0,
  "spell": {},
  "ld_result": {
    "srclangs": [
      "ja"
    ],
    "srclangs_confidences": [
      1.0
    ],
    "extended_srclangs": [
      "ja"
    ]
  }
}
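
The same request can be sketched in Python using only the standard library. The endpoint and query parameters are copied from the curl command above; the function names are hypothetical, and joining multiple `src_translit` values with a space is an assumption about how multi-sentence input should be recombined. The endpoint is undocumented, so treat this as a sketch rather than a stable integration.

```python
import json
from typing import Any, Dict, Optional
from urllib.parse import urlencode
from urllib.request import urlopen

ENDPOINT = "https://translate.google.com/translate_a/single"


def build_url(text: str, source_lang: str = "auto") -> str:
    """Build the request URL with the same parameters as the curl call."""
    params = {"sl": source_lang, "client": "gtx", "dt": "rm", "dj": "1", "q": text}
    return f"{ENDPOINT}?{urlencode(params)}"


def parse_translit(payload: Dict[str, Any]) -> Optional[str]:
    """Collect every src_translit value from the "sentences" array."""
    sentences = payload.get("sentences") or []
    parts = [s["src_translit"] for s in sentences
             if isinstance(s, dict) and s.get("src_translit")]
    # Assumption: multiple sentence parts are joined with a single space.
    return " ".join(parts) if parts else None


def romanize(text: str, timeout: float = 10.0) -> Optional[str]:
    """Fetch and parse the romanization; performs a real network request."""
    with urlopen(build_url(text), timeout=timeout) as response:
        return parse_translit(json.load(response))
```

Keeping `build_url` and `parse_translit` separate from the network call means the parsing can be checked against a saved response like the one above without hitting the endpoint.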

h-banii avatar May 09 '25 22:05 h-banii