seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Results 226 seamless_communication issues

SeamlessExpressive evaluation of eng-cmn normalizes both transcribed and ground truth texts into simplified Chinese texts. It is observed that there is a mix of traditional and simplified Chinese characters in...

CLA Signed

Hello, I have noticed the following concerning the audio frames provided for retrieval; they seem to be slightly erroneous. For instance, the following metadata (enA-mtA and enA-mlt): - [enA-mtA metadata...

I'm trying to fine-tune the seamlessM4T_v2_large model on a speech translation task. Would there be a reason for the model to return NaN values? `tokens, units = model(batch)` **Output**...

Hi, I am trying to use the SeamlessM4T_medium checkpoint for evaluation, but I am getting the following error while loading it. I just added `--model_name seamlessM4T_medium` to the command, is there...

I tried to fine-tune on a new language using the m4t_cli scripts, without success. I get the following error, which I cannot understand. However, it is indicated in the dataloader that`...

I want to build a simple desktop app that translates a user's speech in real time. Using Blackhole, I'd like to stream the audio from, say, a Zoom call into the...

# What? For a bilingual model, running T2TT returns a Result in which only the `transcription` and `word_confidence_score` are of interest. In such cases, it is more convenient to...

CLA Signed

Hi all, Great work on Seamless! I am using parts of `seamless_communication` (in particular, some of the alignment models) in an industry engineering project, and rather than cloning the repository...

### Problem Description The dataset provided at [this link](https://github.com/facebookresearch/seamless_communication/blob/main/docs/m4t/seamless_align_README.md) presents challenges in extracting Maltese datasets. Specifically, the metadata for Textual Audio alignment includes a subset seemingly sourced from common-crawl, with...

Hello everyone, ## Issue Description ### Observation The Maltese dataset dated November 30, 2023, is strictly identical to the previous version, without any observable extension. The dataset's metadata is provided...