NeMo-text-processing

Jp itn 20240221

Open BuyuanCui opened this issue 1 year ago • 2 comments

What does this PR do ?

PR for Japanese inverse text normalization (ITN), submitted in place of #101.

Before your PR is "Ready for review"

Pre checks:

  • [x] Have you signed your commits? Use git commit -s to sign.
  • [x] Do all unittests finish successfully before sending PR?
    1. pytest or (if your machine does not have GPU) pytest --cpu from the root folder (given you marked your test cases accordingly @pytest.mark.run_only_on('CPU')).
    2. Sparrowhawk tests bash tools/text_processing_deployment/export_grammars.sh --MODE=test ...
  • [x] If you are adding a new feature: Have you added test cases for both pytest and Sparrowhawk here.
  • [x] Have you added __init__.py for every folder and subfolder, including data folder which has .TSV files?
  • [x] Have you followed codeQL results and removed unused variables and imports (report is at the bottom of the PR in github review box) ?
  • [x] Have you added the correct license header Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. to all newly added Python files?
  • [ ] If you copied nemo_text_processing/text_normalization/en/graph_utils.py your header's second line should be Copyright 2015 and onwards Google, Inc.. See an example here.
  • [x] Remove import guards (try import: ... except: ...) if not already done.
  • [ ] If you added a new language or a new feature please update the NeMo documentation (lives in different repo).
  • [x] Have you added your language support to tools/text_processing_deployment/pynini_export.py.
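
The __init__.py item in the checklist above can be verified mechanically before pushing. The following is a minimal sketch, not part of the repository: the helper name find_missing_init is hypothetical, and it assumes that "every folder and subfolder" means any directory that directly contains .py or .tsv files.

```python
from pathlib import Path


def find_missing_init(root):
    """Return directories under root that hold .py or .tsv files but no __init__.py.

    Hypothetical helper for illustration; it is not a NeMo-text-processing API.
    """
    root = Path(root)
    missing = []
    # Walk the root itself plus every nested directory, in a stable order.
    directories = sorted(p for p in [root, *root.rglob("*")] if p.is_dir())
    for directory in directories:
        names = [f.name for f in directory.iterdir() if f.is_file()]
        # Compare extensions case-insensitively so both .tsv and .TSV count.
        has_sources = any(n.lower().endswith((".py", ".tsv")) for n in names)
        if has_sources and "__init__.py" not in names:
            missing.append(directory)
    return missing


# Example usage (path taken from this PR; adjust to the folder being added):
# for d in find_missing_init("nemo_text_processing/inverse_text_normalization/ja"):
#     print("missing __init__.py in", d)
```

An empty result means every folder that holds Python or TSV files also has an __init__.py.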

PR Type:

  • [x] New Feature
  • [x] Bugfix
  • [ ] Documentation
  • [ ] Test

If you haven't finished some of the above items you can still open "Draft" PR.

BuyuanCui, Feb 21 '24 21:02

This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.

github-actions[bot], Mar 09 '24 01:03

This PR was closed because it has been inactive for 7 days since being marked as stale.

github-actions[bot], Mar 16 '24 01:03

This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.

github-actions[bot], May 04 '24 01:05

This PR was closed because it has been inactive for 7 days since being marked as stale.

github-actions[bot], May 11 '24 01:05

Updated files from the previous round of review:

  • Jenkinsfile (to add test for Japanese and date update)
  • tests/nemo_text_processing/ja/data_inverse_text_normalization/test_cases_fraction.txt (changed the text format, changed 1 3/4 to 1荷3/4 and 1と3/4)
  • tools/text_processing_deployment/pynini_export.py (added the ITN Process FST for the space issue)
  • nemo_text_processing/inverse_text_normalization/ja/verbalizers/post_processing.py (to resolve the space issue)
  • updates received later that are already merged into main, including the files under tools/text_processing_deployment

BuyuanCui, Jul 10 '24 00:07

Updating according to XInchao's comment.

BuyuanCui, Jul 16 '24 15:07

Updated files:

  • nemo_text_processing/inverse_text_normalization/ja/verbalizers/date.py at line 36, for comments
  • nemo_text_processing/inverse_text_normalization/ja/taggers/date.py at line 37 and from line 89 to line 108

BuyuanCui, Jul 16 '24 20:07