CLDR-18745 LLM_CLDR_explanation.py

Open preetsojitra2712 opened this issue 4 months ago • 3 comments

CLDR-18745

This pull request contains the final, feature-complete version of the CLDR validator script, now named LLM_CLDR_explanation.py.

Summary of Changes

This PR introduces several major improvements to usability and depth of analysis:

  • Named Command-Line Arguments: The script now uses argparse for clear, named arguments (e.g., --locale-code, --prompt), replacing the previous positional-based input.

  • Mismatch Explanations: When a validation results in a "Mismatch," the script now makes a second call to the LLM, asking it to briefly explain why it chose its answer over the CLDR standard. The explanation is added to the final report.

  • File Output: The final JSON validation report is now automatically saved to a uniquely named file (e.g., report_US_person-height.json) for record-keeping.

  • Improved Comparison Logic: The validation is now more robust. A normalize_unit function absorbs minor formatting differences (such as capitalization and spaces vs. hyphens), so that values differing only in formatting are not reported as mismatches.

  • [ ] This PR completes the ticket.
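
For reviewers, here is a minimal sketch of how the pieces listed above could fit together. It is not the code in LLM_CLDR_explanation.py. Only the names --locale-code, --prompt, normalize_unit, and the report_US_person-height.json file name pattern come from this description; everything else (compare_units, validate, report_path, save_report, the ask_llm callable, the category parameter, the report keys) is hypothetical and chosen for illustration.

```python
import argparse
import json
import re
from pathlib import Path
from typing import Callable


def build_parser() -> argparse.ArgumentParser:
    # Only the two flag names appear in the PR description; the help
    # texts and required=True are assumptions.
    parser = argparse.ArgumentParser(description="Validate an LLM answer against CLDR.")
    parser.add_argument("--locale-code", required=True, help="e.g. US")
    parser.add_argument("--prompt", required=True, help="question sent to the LLM")
    return parser


def normalize_unit(unit: str) -> str:
    # Assumed behavior: trim, lowercase, and collapse any run of
    # whitespace or hyphens into a single hyphen.
    return re.sub(r"[\s-]+", "-", unit.strip().lower())


def compare_units(llm_unit: str, cldr_unit: str) -> str:
    # Compare the normalized forms so formatting differences do not count.
    same = normalize_unit(llm_unit) == normalize_unit(cldr_unit)
    return "Match" if same else "Mismatch"


def validate(llm_unit: str, cldr_unit: str, ask_llm: Callable[[str], str]) -> dict:
    # ask_llm is a hypothetical callable standing in for the real LLM client.
    report = {"llm": llm_unit, "cldr": cldr_unit, "status": compare_units(llm_unit, cldr_unit)}
    if report["status"] == "Mismatch":
        # Second call, made only on a mismatch, asking for a short explanation.
        question = (
            f"Briefly explain why you answered {llm_unit!r} "
            f"instead of the CLDR value {cldr_unit!r}."
        )
        report["reason"] = ask_llm(question)
    return report


def report_path(locale_code: str, category: str, directory: Path = Path(".")) -> Path:
    # Mirrors the example file name report_US_person-height.json;
    # "category" is an assumed name for the second component.
    return directory / f"report_{locale_code}_{category}.json"


def save_report(report: dict, path: Path) -> None:
    # Write the report as indented JSON.
    path.write_text(json.dumps(report, indent=2), encoding="utf-8")
```

Passing the LLM call in as a plain callable keeps the comparison and reporting logic testable without network access, for example validate("centimeter", "foot-inch", lambda q: "stub reason").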

ALLOW_MANY_COMMITS=true

preetsojitra2712, Aug 05 '25 00:08

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

Hi @younies,

This pull request contains the final, feature-complete version of the CLDR validator script, now named LLM_CLDR_explanation.py. Please review it.

preetsojitra2712, Aug 05 '25 00:08