NeMo-text-processing
Jp itn 20240221
What does this PR do?
Adds Japanese inverse text normalization (ITN). This PR replaces #101.
Before your PR is "Ready for review"
Pre checks:
- [x] Have you signed your commits? Use `git commit -s` to sign.
- [x] Do all unittests finish successfully before sending PR?
  - `pytest` or (if your machine does not have GPU) `pytest --cpu` from the root folder (given you marked your test cases accordingly `@pytest.mark.run_only_on('CPU')`).
  - Sparrowhawk tests `bash tools/text_processing_deployment/export_grammars.sh --MODE=test ...`
- [x] If you are adding a new feature: Have you added test cases for both `pytest` and Sparrowhawk here.
- [x] Have you added `__init__.py` for every folder and subfolder, including `data` folder which has .TSV files?
- [x] Have you followed codeQL results and removed unused variables and imports (report is at the bottom of the PR in github review box)?
- [x] Have you added the correct license header `Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.` to all newly added Python files?
- [ ] If you copied nemo_text_processing/text_normalization/en/graph_utils.py your header's second line should be `Copyright 2015 and onwards Google, Inc.`. See an example here.
- [x] Remove import guards (`try import: ... except: ...`) if not already done.
- [ ] If you added a new language or a new feature please update the NeMo documentation (lives in different repo).
- [x] Have you added your language support to tools/text_processing_deployment/pynini_export.py.
PR Type:
- [x] New Feature
- [x] Bugfix
- [ ] Documentation
- [ ] Test
If you haven't finished some of the above items you can still open "Draft" PR.
This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.
This PR was closed because it has been inactive for 7 days since being marked as stale.
Files updated since the previous round of review:
- Jenkinsfile (added the Japanese test and updated the date)
- tests/nemo_text_processing/ja/data_inverse_text_normalization/test_cases_fraction.txt (changed the text format; changed 1 3/4 to 1荷3/4 and 1と3/4)
- tools/text_processing_deployment/pynini_export.py (added the ITN Process FST for the space issue)
- nemo_text_processing/inverse_text_normalization/ja/verbalizers/post_processing.py (to resolve the space issue)
- Later updates that have already been merged into main, including the files under tools/text_processing_deployment
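For reviewers who want a quick mental model of the space fix, here is a minimal plain-Python sketch. It assumes the "space issue" means unwanted spaces left between adjacent Japanese characters in the verbalized output; the PR text does not spell this out. It uses `re` as a stand-in and is not the pynini grammar in post_processing.py. The function name and the character ranges are hypothetical.

```python
import re

# Assumption: "Japanese character" here means hiragana, katakana, or a
# CJK unified ideograph. The real grammar may use a different set.
_JA_CHAR = r"[\u3040-\u30ff\u4e00-\u9fff]"

# Whitespace that has a Japanese character on both sides.
_SPACE_BETWEEN_JA = re.compile(rf"(?<={_JA_CHAR})\s+(?={_JA_CHAR})")


def remove_spaces_between_japanese(text: str) -> str:
    """Delete whitespace only where both neighbours are Japanese characters.

    Hypothetical helper for illustration; spaces next to digits or Latin
    characters are left untouched.
    """
    return _SPACE_BETWEEN_JA.sub("", text)


if __name__ == "__main__":
    # Illustrative inputs only, not taken from the PR's test case files.
    print(remove_spaces_between_japanese("二月 二十一日"))
    print(remove_spaces_between_japanese("1 3/4"))
```

Using lookarounds keeps the surrounding characters out of the match, so a run such as "あ い う" is handled in a single pass.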
Updated according to XInchao's comment:
- nemo_text_processing/inverse_text_normalization/ja/verbalizers/date.py: updated at line 36 for comments
- nemo_text_processing/inverse_text_normalization/ja/taggers/date.py: updated at line 37 and from line 89 to line 108