Jan Lasek issues

Results: 11 issues of Jan Lasek

Copy sparse, dense and labels from the right source when shuffling Criteo data

When dense and sparse features are located in different directories, i.e., `input_dir_labels_and_dense != input_dir_sparse` in the arguments to the `shuffle()` function, the function throws an error at [L642](https://github.com/pytorch/torchrec/blob/71d51c8764c141ff8d849f73bcf06548ffad36c4/torchrec/datasets/criteo.py#L642) like: ``` FileNotFoundError:...

CLA Signed

AMMO Integration with Llama2 Post-Training Quantization Example and Tests

# What does this PR do ? Integrates the AMMO library into the project and provides utilities for quantizing models, with a Llama2 PTQ example. Different quantization algorithms are available including **INT8...

NLP

Account for mpirun use case in get_rank

# What does this PR do ? Some distributed workloads can be launched "manually" with `mpirun -n N python ...`. I'm adapting the `nemo.utils.get_rank.py` module to account for this...
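To illustrate the idea behind the `mpirun` use case, here is a minimal sketch of rank detection from launcher-provided environment variables. It is not the code from the pull request: the helper name `get_rank_from_env`, the lookup order, and the default of `0` are assumptions made for illustration. The variable names themselves are the ones set by common launchers (`RANK` by `torchrun`, `OMPI_COMM_WORLD_RANK` by Open MPI's `mpirun`, `PMI_RANK` by MPICH-style launchers, `SLURM_PROCID` by `srun`).

```python
import os

# Environment variables checked in order; the first one that is set wins.
# The order is an assumption for this sketch, not NeMo's actual behavior.
_RANK_ENV_VARS = ("RANK", "OMPI_COMM_WORLD_RANK", "PMI_RANK", "SLURM_PROCID")


def get_rank_from_env(environ=None, default=0):
    """Return the process rank found in the environment, or `default`.

    `environ` is any mapping of variable names to strings; it defaults to
    os.environ and exists mainly to make the helper easy to test.
    """
    env = os.environ if environ is None else environ
    for name in _RANK_ENV_VARS:
        value = env.get(name)
        if value is not None:
            return int(value)
    # No launcher variable found: treat the process as a single-process run.
    return default
```

Under these assumptions, a process started with `mpirun -n 4 python script.py` would see `OMPI_COMM_WORLD_RANK` set by Open MPI, so the helper would return that value even though `RANK` is unset.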

Create causal mask on MCore/TE side

# What does this PR do ? Currently Megatron-LM creates causal attention mask for a local-layer spec model on its own -- see https://github.com/NVIDIA/Megatron-LM/commit/a45805a3ee0645b85b48d14b0a8077fa5b1216b2 -- and hence `attention_mask=None` can be...

NLP
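For readers unfamiliar with the causal mask mentioned in the pull request above, the following is a small, framework-free sketch of what such a mask looks like. It is not Megatron-LM or Transformer Engine code; the function name `causal_mask` and the convention that `True` marks a position to be masked out are assumptions made for illustration.

```python
def causal_mask(seq_len):
    """Build a seq_len x seq_len boolean matrix for causal attention.

    Entry [query][key] is True when the key position lies in the future
    relative to the query position, i.e. when it should be masked out.
    (The True-means-masked convention is an assumption of this sketch.)
    """
    return [[key > query for key in range(seq_len)] for query in range(seq_len)]


# Each row is one query position; only earlier or equal key positions stay visible.
for row in causal_mask(4):
    print(["x" if masked else "." for masked in row])
```

Because the mask depends only on the sequence length, a library can construct it internally, which is why a caller may be able to pass no explicit mask at all.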

Unused imports cleanup

# What does this PR do ? Cleanup of unused imports identified by [Flake8](https://flake8.pycqa.org/en/latest), see https://www.flake8rules.com/rules/F401.html. **Collection**: [ALL] # Jenkins CI To run Jenkins, a NeMo User with write access must...

core

TTS

ASR

NLP

common

Restore PTQ tests for Llama2

# What does this PR do ? Restores PTQ tests for the Llama2 model. The tests are moved here from the recently disabled [Jenkinsfile](https://github.com/NVIDIA/NeMo/blob/main/Jenkinsfile). Testing all methods: FP8, INT8 SQ, INT4 AWQ...

Extend Nemo AutoTokenizer & SentencePieceTokenizer API for TensorRT-LLM & AMMO evaluation scripts usage

# What does this PR do ? Extends NeMo's AutoTokenizer and SentencePieceTokenizer for consistency with the HuggingFace tokenizers API, which is used throughout the TensorRT-LLM and AMMO evaluation tools. This...

stale

common

Update to using Model Optimizer (formerly AMMO) in PTQ workflow

# What does this PR do ? Summary: * use Model Optimizer library (formerly AMMO) for LLM PTQ workflow * restore PTQ tests failing due to setting `apply_rope_fusion = True`...

NLP

Update nemo.export module for quantized models

# What does this PR do ? Fix TRT-LLM engine export issues for quantized "qnemo" models. This is a cherry-pick mirror of recent changes introduced in the main branch in...

NLP

Run CICD

Export & deploy updates (part I)

# What does this PR do ? Extra cleanup changes related to https://github.com/NVIDIA/NeMo/pull/10904, isolated here to make the main functionality easier to review. **Collection**: NLP # Changelog - Add...