CLDR-17582 Cleanup English annotations

Open macchiati opened this issue 1 year ago • 2 comments

CLDR-17582

  • This cleans up the English annotations according to the directions given to vetters.
  • The multiword search keys are split up.
  • Duplicates (or near duplicates), synonyms, and low-value terms are removed.
    • or at least reduced, with a focus on items that had a very large number of annotations.
  • Reduce the maximum number of search terms on items.
  • Other misc. cleanup.

When reviewing, note that breaking up keywords and then alphabetizing scatters the words of each phrase across the list. Example:

  • cp="😄"
  • old: awesome | face | grin | grinning face with big eyes | happy | mouth | open | smile | smiling | smiling face with open mouth | teeth | yay
  • new: awesome | big | eyes | face | grin | grinning | happy | mouth | open | smile | smiling | teeth | yay
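
The old/new pair above can be reproduced with a short sketch. This is an illustration only, not the script used for this PR: the function name `split_keywords` and the `LOW_VALUE` set (here just "with") are assumptions, chosen so that the output matches the "new" line shown.

```python
# Hypothetical helper: split multiword keys, drop low-value words,
# remove duplicates, and alphabetize.
LOW_VALUE = frozenset({"with"})  # assumed; the PR does not list these words


def split_keywords(annotation, low_value=LOW_VALUE):
    words = set()
    for phrase in annotation.split("|"):
        # A phrase such as "grinning face with big eyes" becomes single words.
        for word in phrase.split():
            if word not in low_value:
                words.add(word)
    # Sorting is what scatters the words of a phrase across the list.
    return " | ".join(sorted(words))


old = ("awesome | face | grin | grinning face with big eyes | happy | mouth | "
       "open | smile | smiling | smiling face with open mouth | teeth | yay")
print(split_keywords(old))
```

Because the result is a sorted set, "big" and "eyes" no longer sit next to "grinning", which is why a reviewer comparing old and new lines will not find the original phrases intact.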

We can't expect a line-by-line review, so please just spot-check for anything that is terrible. We can tweak the English later, before release. The goal here is to adhere more closely to the guidelines, to lessen the chance that vetters will be misled (though we caution them that they need to look at the associations in their own languages, not in English!).

  • [ ] This PR completes the ticket.

ALLOW_MANY_COMMITS=true

macchiati avatar May 24 '24 20:05 macchiati

The term [eyes] doesn't distinguish much among smilies!

But I don't think that is a showstopper.


What might be useful is a report that mapped search keyword to emoji?
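
A minimal sketch of such a report, assuming it is built by inverting the annotations file with Python's standard `xml.etree.ElementTree`. The function name `keyword_report` and the bare `<annotations>` wrapper in the sample are assumptions for illustration; the annotation lines themselves are taken from the diff under discussion.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Simplified sample; the real en.xml has more structure around these elements.
SAMPLE = """<annotations>
  <annotation cp="😉">face | flirt | heartbreaker | sexy | slide | tease | wink | winking | winks</annotation>
  <annotation cp="😉" type="tts">winking face</annotation>
  <annotation cp="😊">blush | eye | eyes | face | glad | satisfied | smile | smiling</annotation>
</annotations>"""


def keyword_report(xml_text):
    """Map each search keyword to the emoji that list it."""
    index = defaultdict(list)
    root = ET.fromstring(xml_text)
    for node in root.iter("annotation"):
        # type="tts" entries are names, not search keywords, so skip them.
        if node.get("type") == "tts":
            continue
        for keyword in (node.text or "").split("|"):
            keyword = keyword.strip()
            if keyword:
                index[keyword].append(node.get("cp"))
    return dict(index)


report = keyword_report(SAMPLE)
for keyword in sorted(report):
    print(keyword, "->", " ".join(report[keyword]))
```

Sorting the output by the number of emoji per keyword would surface terms like "face" or "eyes" that distinguish little among smilies.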

On Fri, May 24, 2024, 14:49 Fredrik @.***> wrote:

@.**** commented on this pull request.

In common/annotations/en.xml https://github.com/unicode-org/cldr/pull/3751#discussion_r1614061093:

  <annotation cp="🫠" type="tts">melting face</annotation>
  <annotation cp="😉">face | flirt | heartbreaker | sexy | slide | tease | wink | winking | winks</annotation>
  <annotation cp="😉" type="tts">winking face</annotation>
- <annotation cp="😊">blush | eye | face | glad | satisfied | smile | smiling | smiling face with smiling eyes</annotation>
+ <annotation cp="😊">blush | eye | eyes | face | glad | satisfied | smile | smiling</annotation>

Not sure if this is the emoji I would expect if searching for "eye" or "eyes"?

macchiati avatar May 24 '24 23:05 macchiati

I think before we send this out to translation into some 70-80 languages, we should remove superfluous terms though, no?

stenshamn avatar May 24 '24 23:05 stenshamn

I think there are too many superfluous individual terms that have no relation to the emoji and will not help search. Before sending this out for translation in all the CLDR languages, we should clean that up to avoid a lot of confusion and wasted effort, IMHO.

See what you think now; there is a fair amount of cleanup, so I think it is overall better than what we are showing to translators right now.

macchiati avatar May 28 '24 16:05 macchiati

Spec update LGTM

AEApple avatar May 29 '24 19:05 AEApple

Don't worry about the jira-ticket commit; will squash once it is approved.

macchiati avatar May 30 '24 03:05 macchiati

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot