NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
**Is your feature request related to a problem? Please describe.** `normalize()` has a `500`-word limit, per https://github.com/NVIDIA/NeMo/blob/main/nemo_text_processing/text_normalization/normalize.py#L255. This is not documented in the documentation pages as far as...
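One possible workaround for the word limit described above is to split long input into smaller pieces before normalizing each one. The sketch below is only an illustration under stated assumptions: `split_into_word_chunks` is a hypothetical helper (not part of NeMo), it assumes the limit is counted in whitespace-separated words, and it does not call the normalizer itself.

```python
def split_into_word_chunks(text, max_words=500):
    """Split text into chunks of at most `max_words` whitespace-separated words.

    Hypothetical helper, not part of NeMo. Assumes the limit is counted in
    whitespace-separated words; the default of 500 mirrors the limit
    mentioned in the issue.
    """
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    # Step through the word list in strides of max_words and rejoin each slice.
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


# Example: a 1200-word input is split into chunks of 500, 500, and 200 words.
long_text = " ".join(["word"] * 1200)
chunks = split_into_word_chunks(long_text)
chunk_sizes = [len(chunk.split()) for chunk in chunks]
```

Each chunk could then be passed to the normalizer separately. Note that splitting on a fixed word count can cut a sentence in half, which may change how text near the boundary is normalized; splitting on sentence boundaries first would likely be safer.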
**Describe the bug** Installing `nemo_toolkit[nemo_text_processing]` does not bring in `pytorch_lightning`, but then

```python
from nemo_text_processing.text_normalization import normalize
```

fails because `pytorch_lightning` is required for the import. **Steps/Code to reproduce bug** 1....
I am running TTS with the code below. As a next step, I want to run TTS offline on an Android device using the same pretrained models. ```python import soundfile...
@PeganovAnton When training on multiple GPUs, I'm noticing that `train_loss` is decreasing and `f1` scores are increasing, but `val_loss` is increasing as well. Is `val_loss` the right metric to monitor? Would mean...
Regardless of the model being exported, the script fails with ```python3 File "scripts/export.py", line 160, in nemo_export raise e File "scripts/export.py", line 139, in nemo_export output_example = forward_method(model)(*input_list, **input_dict) File...
I tried to export a FastPitch model I trained to ONNX with [export.py](https://github.com/NVIDIA/NeMo/blob/main/scripts/export.py). The command I used was: `python scripts/export.py "/home/xxx/TTSs/Nemo-models/nancy_fastpitch-44k-new-v3.nemo" "nancy_fastpitch-44k-new-v3.onnx" --runtime-check --device="cpu" --autocast` But it produced an error:...
**Describe the bug** After a few steps when pretraining a `SpeechEncDecSelfSupervisedModel`, training fails with the following error ```sh File "/opt/conda/lib/python3.8/site-packages/nemo/collections/asr/models/ssl_models.py", line 468, in training_step loss_value, loss_val_dict = self.decoder_loss_step( File "/opt/conda/lib/python3.8/site-packages/nemo/collections/asr/models/ssl_models.py",...
I am performing VAD using `examples/asr/vad_infer.py`. From the code, I see that MarbleNet operates directly on raw audio data using convolution filters. There is no separate feature extraction from...
Hello, can you guide me on how to train and fine-tune the Speaker Diarization model covered in https://github.com/NVIDIA/NeMo/blob/main/tutorials/speaker_tasks/Speaker_Diarization_Inference.ipynb? I was unable to find any documentation on this. Thanks
Hi @nithinraok, please take a look at this. I am trying to fine-tune a speaker recognition model on an open-source dataset. When I set `train_manifest_path` and `dev_manifest_path` both to the same...