GigaAM

Published 20 hours ago • salute-developers

Metadata

Foundational Model for Speech Recognition Tasks

Readme
Issues

Results 11 GigaAM issues


what is the ssl model for and how to use it.

can someone show me a schematic of how this neural network works?

strangely, many empty transcriptions on mozilla common voice

Attached is an example file for which CTC inference consistently returns an empty transcription; this was verified on two different machines with different GPUs. [common_voice_ru_35728771.zip](https://github.com/salute-developers/GigaAM/files/15138667/common_voice_ru_35728771.zip) It comes from the Mozilla Common Voice set (cv-corpus-12.0-delta-2022-12-07) and on it...

CUDA out of memory

2 comments

It handled a small file (60 KB) perfectly, but a problem came up when recognizing speech in a 2.1 MB file: ``` (venv) root@dk04:~/GigaAM# file /mnt/rec/0b5ef5be-0925-4462-8f4e-cecab7f4d572.wav /mnt/rec/0b5ef5be-0925-4462-8f4e-cecab7f4d572.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16...

Update README.md

added packages required for installation

Train from scratch on the datasets used for finetune

Hi! Curious, do you provide baselines/checkpoints where you train from scratch on Golos+Sova+RCV+RLS, including some models like FastConformer (hybrid CTC+RNNT)? It would be helpful for reproducing baselines, given that NVIDIA does...

Transcribe stream from microphone

Good day. I tried your model and was quite pleased with the result. Is there any way to use your model with a microphone?

ONNX

1 comment

How can I launch GigaAM_RNNT using ONNX? sherpa-onnx crashes Colab. Please help me with this.

attribute 'asr' not found in nemo.collections

This error appears after installing with docker or virtual environment as described in examples README.md: AttributeError: module 'nemo.collections' has no attribute 'asr'

transcript with timestamp

Added an example of how to output word timestamps

GigaAM C++ example

Is it possible to use your solution on cpp Torch (libtorch)?


About

Foundational Model for Speech Recognition Tasks

emotion-recognition

speech-recognition

self-supervised-learning

foundation-models

84

Stars

3

Forks

Watchers

Owner

salute-developers
