Enforce UTF-8 on loading all locale-specific dictionary labels

Open i-am-bbking opened this issue 6 months ago • 2 comments

  • Also added UTF-8 encoding on file reading elsewhere in the code
  • Related to https://github.com/NVIDIA/NeMo/pull/3520

What does this PR do?

Fixes the bug detailed in https://github.com/NVIDIA/NeMo/issues/13310.
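For reviewers who want the gist of the change, here is a minimal sketch of reading a tab-separated label file with an explicit UTF-8 encoding. The function name `load_labels_utf8` and the sample file are hypothetical, used for illustration only; this is not the exact code from the PR.

```python
import csv
import os
import tempfile


def load_labels_utf8(path):
    """Read a tab-separated label file, always decoding it as UTF-8.

    Hypothetical helper for illustration; not the function changed in this PR.
    """
    # Without encoding=..., open() may fall back to the platform's locale
    # encoding (for example cp1252 on some Windows setups), which can
    # mis-decode or reject non-ASCII labels.
    with open(path, encoding="utf-8", newline="") as label_tsv:
        return list(csv.reader(label_tsv, delimiter="\t"))


# Usage: write a small file with non-ASCII labels, then read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "labels.tsv")
    with open(sample_path, "w", encoding="utf-8", newline="") as f:
        f.write("über\tueber\nnaïve\tnaive\n")
    labels = load_labels_utf8(sample_path)
```

Passing the encoding explicitly makes the result independent of the machine's locale settings, which is the behavior the title of this PR asks for.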

Before your PR is "Ready for review"

Pre checks:

  • [X] Have you signed your commits? Use git commit -s to sign.
  • [X] Do all unittests finish successfully before sending PR?
    1. pytest or (if your machine does not have GPU) pytest --cpu from the root folder (given you marked your test cases accordingly @pytest.mark.run_only_on('CPU')).
    2. Sparrowhawk tests bash tools/text_processing_deployment/export_grammars.sh --MODE=test ...
  • [x] If you are adding a new feature: Have you added test cases for both pytest and Sparrowhawk here.
  • [X] Have you added __init__.py for every folder and subfolder, including data folder which has .TSV files?
  • [X] Have you followed codeQL results and removed unused variables and imports (report is at the bottom of the PR in github review box) ?
  • [X] Have you added the correct license header Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. to all newly added Python files?
  • [X] If you copied nemo_text_processing/text_normalization/en/graph_utils.py your header's second line should be Copyright 2015 and onwards Google, Inc.. See an example here.
  • [X] Remove import guards (try import: ... except: ...) if not already done.
  • [X] If you added a new language or a new feature please update the NeMo documentation (lives in different repo).
  • [X] Have you added your language support to tools/text_processing_deployment/pynini_export.py.

PR Type:

  • [ ] New Feature
  • [X] Bugfix
  • [ ] Documentation
  • [ ] Test

If you haven't finished some of the above items you can still open "Draft" PR.

i-am-bbking, Apr 28 '25 18:04