NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
I am trying to train on ~5.6 GB of data with ~700 MB of validation data using the command below:

```shell
python /workspace/data/NeMo/examples/nlp/language_modeling/bert_pretraining.py \
  --config-name=/workspace/data/NeMo/examples/nlp/language_modeling/conf/bert_pretraining_from_text_config.yaml \
  model.train_ds.data_file="/workspace/data/NeMo/lm/data/public-data/train.txt" \
  model.validation_ds.data_file="/workspace/data/NeMo/lm/data/public-data/val.txt" \
  model.train_ds.batch_size=128 \
  model.optim.lr=5e-5 \
  trainer.max_epochs=1 \
  trainer.gpus=1
```

It then shows this error: GPU available: True,...
# What does this PR do ? This is a work-in-progress for a model and data set that performs multilingual punctuation restoration, true casing, and sentence boundary detection. See `Usage`...
# What does this PR do ? Adds a script to load Wav2Vec2.0 weights from Fair into the NeMo implementation. Also adjusts the NeMo implementation to be similar to Fair's. **Collection**: [ASR]...
The PR adds the new SOTA model we have for intent classification and slot filling for spoken language understanding. - [x] Move transformer modules from `nemo.collections.nlp.modules.common` to `nemo.collections.common.parts`, see also...
**Describe the bug** I'm trying to convert .nemo to .riva, but I'm getting the message `AttributeError: 'EncDecRNNTBPEModel' object has no attribute 'input_example'`. Converting the Conformer-CTC model works, but the Conformer-Transducer...
Can I get more detail on the dataset cleaning process? I am a non-Spanish speaker 😅. For example, the VoxPopuli dataset page says 120 hrs after cleaning, but when I downloaded the dataset it has...
**Describe the bug** Hi, I am trying to convert a BioMegatron model fine-tuned for token classification from NeMo to ONNX format. I am getting the following error: WARNING: The shape...
Not all LR schedulers in PyTorch have a `max_steps` parameter, so we should not add `max_steps` to their `scheduler_args` unconditionally. The previous code tackled the problem in a case-by-case manner, while here...
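The general approach described above can be sketched with `inspect`: filter the argument dict against the scheduler constructor's signature, so unsupported keys such as `max_steps` are dropped for any scheduler class rather than special-cased one by one. This is a minimal illustration using a hypothetical stand-in class, not NeMo's actual code:

```python
import inspect


class StepLRLike:
    # Hypothetical stand-in for a scheduler whose constructor
    # does NOT accept ``max_steps`` (like torch.optim.lr_scheduler.StepLR).
    def __init__(self, optimizer, step_size, gamma=0.1):
        self.step_size = step_size
        self.gamma = gamma


def filter_scheduler_args(scheduler_cls, scheduler_args):
    """Keep only the kwargs the scheduler constructor actually accepts.

    Unsupported keys (e.g. ``max_steps``) are silently dropped, so one
    generic code path works for every scheduler class.
    """
    accepted = inspect.signature(scheduler_cls.__init__).parameters
    return {k: v for k, v in scheduler_args.items() if k in accepted}


args = {"step_size": 10, "gamma": 0.5, "max_steps": 1000}
print(filter_scheduler_args(StepLRLike, args))
# -> {'step_size': 10, 'gamma': 0.5}
```

Schedulers that do declare `max_steps` in their `__init__` would keep the key unchanged, since the filter is driven purely by each class's own signature.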
Signed-off-by: Lily Lee # What does this PR do ? Zero Shot Slot Filling Model **Collection**: NLP # Changelog - Add specific line by line info of high level changes...
**Describe the bug** I am not getting correct timestamps for speech segments, and many speech chunks are removed. I am using the pretrained MarbleNet and speakerdiarization_speakernet models. It removes lots...