CLDR-18745 LLM_CLDR_explanation.py
CLDR-18745
This pull request contains the final, feature-complete version of the CLDR validator script, now named LLM_CLDR_explanation.py.
Summary of Changes
This PR introduces several major improvements for usability and more in-depth analysis:
- Named Command-Line Arguments: The script now uses argparse for clear, named arguments (e.g., --locale-code, --prompt), replacing the previous positional input.
- Mismatch Explanations: When a validation results in a "Mismatch," the script now makes a second call to the LLM to ask for a brief explanation of why it chose its answer over the CLDR standard. This reason is added to the final report.
- File Output: The final JSON validation report is now automatically saved to a uniquely named file (e.g., report_US_person-height.json) for record-keeping.
- Improved Comparison Logic: The validation is now more robust. A normalize_unit function handles minor formatting differences (such as capitalization and spaces vs. hyphens), making the comparison more practical and accurate.
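For reviewers who want a quick picture of the argument parsing, unit normalization, and report naming described above, here is a minimal sketch. It is not the code in LLM_CLDR_explanation.py: only --locale-code, --prompt, normalize_unit, and the example file name come from this description. The helper names build_parser, compare_units, and report_filename, the "Match" label, and the exact normalization rules are assumptions made for illustration.

```python
import argparse
import re


def build_parser() -> argparse.ArgumentParser:
    """Named arguments; only --locale-code and --prompt appear in the PR text."""
    parser = argparse.ArgumentParser(description="Validate an LLM answer against CLDR.")
    parser.add_argument("--locale-code", required=True, help="Locale or region code, e.g. US")
    parser.add_argument("--prompt", required=True, help="Question sent to the LLM")
    return parser


def normalize_unit(unit: str) -> str:
    """Assumed rules: trim, lowercase, collapse runs of spaces/hyphens into one hyphen."""
    return re.sub(r"[\s-]+", "-", unit.strip().lower())


def compare_units(llm_unit: str, cldr_unit: str) -> str:
    """Compare both sides after normalization ("Match" is an assumed label)."""
    same = normalize_unit(llm_unit) == normalize_unit(cldr_unit)
    return "Match" if same else "Mismatch"


def report_filename(locale_code: str, usage: str) -> str:
    """Hypothetical helper that mirrors the example name report_US_person-height.json."""
    return f"report_{locale_code}_{usage}.json"


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--locale-code", "US", "--prompt", "How is height measured?"])
status = compare_units("Foot Inch", "foot-inch")
filename = report_filename(args.locale_code, "person-height")
```

With these assumed rules, "Foot Inch" and "foot-inch" compare as equal, so formatting differences alone would not trigger the second LLM call for an explanation.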
[ ] This PR completes the ticket.
ALLOW_MANY_COMMITS=true
Hooray! The files in the branch are the same across the force-push. 😃
~ Your Friendly Jira-GitHub PR Checker Bot
Hi @younies,
This pull request contains the final, feature-complete version of the CLDR validator script, now named LLM_CLDR_explanation.py.
Please review this.