worldcubeassociation.org

Locale YAMLs have hundreds of deleted keys

Open FinnIckler opened this issue 9 months ago • 3 comments

While I was doing the database translation PRs, I noticed that a lot of older locale files have loads of keys that have already been deleted from en.yml.

This actually generated a ton of unnecessary work for me (and it will lead to much more unnecessary work if a translator decides to base a new translation on an old YAML file like this).

I had ChatGPT generate this script to compare a YAML file against en.yml:

require 'yaml'
require 'set'

# Load YAML files
def load_yaml(file)
  YAML.load_file(file) || {}
end

# Recursively collect all translation keys
def collect_keys(hash, prefix = '')
  keys = {}
  hash.each do |key, value|
    full_key = prefix.empty? ? key.to_s : "#{prefix}.#{key}"
    if value.is_a?(Hash)
      keys.merge!(collect_keys(value, full_key))
    else
      keys[full_key] = value
    end
  end
  keys
end

# Compare translation keys
def find_extra_keys(source_file, target_file)
  source_data = load_yaml(source_file)
  target_data = load_yaml(target_file)

  source_keys = collect_keys(source_data.values.first).keys.to_set
  target_keys = collect_keys(target_data.values.first).keys.to_set

  extra_keys = target_keys - source_keys

  if extra_keys.empty?
    puts "No extra translations found in #{target_file}."
  else
    puts "Extra translations found in #{target_file}:"
    extra_keys.each { |key| puts "  - #{key}" }
  end
end

# Example usage
if ARGV.length < 2
  puts "Usage: ruby script.rb en.yml ca.yml"
  exit 1
end

find_extra_keys(ARGV[0], ARGV[1])

Running it against ca.yml, for example, yields:

Extra translations found in ca.yml:
  - cutoff.time.zero
  - cutoff.moves.zero
  - cutoff.points.zero
  - common.user.citizen_of
  - common.days.zero
  - common.these_events.zero
  - common.these_events.one
  - common.these_events.other
  - activerecord.attributes.competition.id
  - activerecord.attributes.competition.confirmed
  - activerecord.attributes.competition.showAtAll
  - activerecord.attributes.competition.name
  - activerecord.attributes.competition.name_reason
  - activerecord.attributes.competition.cellName
  - activerecord.attributes.competition.countryId
  - activerecord.attributes.competition.cityName
  - activerecord.attributes.competition.venue
  - activerecord.attributes.competition.venueAddress
  - activerecord.attributes.competition.venueDetails
  - activerecord.attributes.competition.start_date
  - activerecord.attributes.competition.end_date
  - activerecord.attributes.competition.external_website
  - activerecord.attributes.competition.generate_website
  - activerecord.attributes.competition.base_entry_fee_lowest_denomination
  - activerecord.attributes.competition.currency_code
  - activerecord.attributes.competition.delegate_ids
  - activerecord.attributes.competition.trainee_delegate_ids
  - activerecord.attributes.competition.organizer_ids
  - activerecord.attributes.competition.contact
  - activerecord.attributes.competition.information
  - activerecord.attributes.competition.use_wca_registration
  - activerecord.attributes.competition.external_registration_page
  - activerecord.attributes.competition.competitor_limit_enabled
  - activerecord.attributes.competition.competitor_limit
  - activerecord.attributes.competition.competitor_limit_reason
  - activerecord.attributes.competition.guests_enabled
  - activerecord.attributes.competition.receive_registration_emails
  - activerecord.attributes.competition.registration_open
  - activerecord.attributes.competition.registration_close
  - activerecord.attributes.competition.remarks
  - activerecord.attributes.competition.clone_tabs
  - activerecord.attributes.competition.competition_events
  - activerecord.attributes.competition.enable_donations
  - activerecord.attributes.competition.championship_type
  - activerecord.attributes.competition.extra_registration_requirements
  - activerecord.attributes.competition.on_the_spot_registration
  - activerecord.attributes.competition.on_the_spot_entry_fee_lowest_denomination
  - activerecord.attributes.competition.allow_registration_edits
  - activerecord.attributes.competition.allow_registration_self_delete_after_acceptance
  - activerecord.attributes.competition.refund_policy_percent
  - activerecord.attributes.competition.refund_policy_limit_date
  - activerecord.attributes.competition.waiting_list_deadline_date
  - activerecord.attributes.competition.event_change_deadline_date
  - activerecord.attributes.competition.guests_entry_fee_lowest_denomination
  - activerecord.attributes.competition.free_guest_entry_status
  - activerecord.attributes.competition.early_puzzle_submission
  - activerecord.attributes.competition.early_puzzle_submission_reason
  - activerecord.attributes.competition.qualification_results
  - activerecord.attributes.competition.qualification_results_reason
  - activerecord.attributes.competition.event_restrictions
  - activerecord.attributes.competition.event_restrictions_reason
  - activerecord.attributes.competition.main_event_id
  - activerecord.attributes.delegate_report.wdc_feedback_requested
  - activerecord.attributes.delegate_report.wdc_incidents
  - enums.user.delegate_status.trainee_delegate
  - enums.user.delegate_status.candidate_delegate
  - enums.user.delegate_status.delegate
  - enums.user.delegate_status.senior_delegate
  - enums.person.gender.m
  - enums.person.gender.f
  - enums.person.gender.o
  - enums.competition.free_guest_entry_status.unclear
  - enums.competition.free_guest_entry_status.anyone
  - enums.competition.free_guest_entry_status.restricted
  - simple_form.hints.user.delegate_status
  - simple_form.hints.user.region
  - simple_form.hints.competition.name
  - simple_form.hints.competition.cellName
  - simple_form.hints.competition.cityName
  - simple_form.hints.competition.information
  - simple_form.hints.competition.delegate_ids
  - simple_form.hints.competition.trainee_delegate_ids
  - simple_form.hints.competition.organizer_ids
  - simple_form.hints.competition.external_website
  - simple_form.hints.competition.registration_open
  - simple_form.hints.competition.remarks
  - simple_form.hints.competition.clone_tabs
  - simple_form.hints.competition.countryId
  - simple_form.hints.competition.end_date
  - simple_form.hints.competition.generate_website
  - simple_form.hints.competition.external_registration_page
  - simple_form.hints.competition.base_entry_fee_lowest_denomination
  - simple_form.hints.competition.currency_code
  - simple_form.hints.competition.guests_enabled
  - simple_form.hints.competition.enable_donations
  - simple_form.hints.competition.id
  - simple_form.hints.competition.confirmed
  - simple_form.hints.competition.receive_registration_emails
  - simple_form.hints.competition.registration_close
  - simple_form.hints.competition.showAtAll
  - simple_form.hints.competition.start_date
  - simple_form.hints.competition.use_wca_registration
  - simple_form.hints.competition.competitor_limit_enabled
  - simple_form.hints.competition.competitor_limit
  - simple_form.hints.competition.competitor_limit_reason
  - simple_form.hints.competition.venueAddress
  - simple_form.hints.competition.championship_type
  - simple_form.hints.competition.extra_registration_requirements
  - simple_form.hints.competition.on_the_spot_registration
  - simple_form.hints.competition.on_the_spot_entry_fee_lowest_denomination
  - simple_form.hints.competition.allow_registration_edits
  - simple_form.hints.competition.allow_registration_self_delete_after_acceptance
  - simple_form.hints.competition.refund_policy_percent
  - simple_form.hints.competition.refund_policy_limit_date
  - simple_form.hints.competition.waiting_list_deadline_date
  - simple_form.hints.competition.event_change_deadline_date
  - simple_form.hints.competition.guests_entry_fee_lowest_denomination
  - simple_form.hints.competition.free_guest_entry_status
  - simple_form.hints.competition.early_puzzle_submission
  - simple_form.hints.competition.early_puzzle_submission_reason
  - simple_form.hints.competition.qualification_results
  - simple_form.hints.competition.qualification_results_reason
  - simple_form.hints.competition.event_restrictions
  - simple_form.hints.competition.event_restrictions_reason
  - simple_form.hints.competition.main_event_id
  - simple_form.hints.website_contact.inquiry
  - simple_form.hints.website_contact.competition_id
  - simple_form.hints.website_contact.message
  - simple_form.hints.website_contact.name
  - simple_form.hints.website_contact.your_email
  - simple_form.hints.post.sticky
  - simple_form.hints.post.unstick_at
  - simple_form.hints.post.title
  - simple_form.hints.post.tags
  - simple_form.hints.delegate_report.wdc_feedback_requested
  - simple_form.hints.delegate_report.wdc_incidents
  - simple_form.hints.person.dob
  - simple_form.hints.person.countryId
  - simple_form.hints.person.gender
  - simple_form.hints.person.name
  - simple_form.hints.person.wca_id
  - simple_form.hints.person.incorrect_wca_id_claim_count
  - simple_form.hints.person.person_wca_id
  - simple_form.hints.registration.comments
  - simple_form.hints.registration.guests
  - simple_form.hints.registration.status
  - simple_form.hints.wfc.from_date
  - simple_form.hints.wfc.to_date
  - wca.errors.messages.form_error.zero
  - wca.devise.welcome
  - layouts.navigation.db_export
  - layouts.navigation.results_admin
  - layouts.navigation.results_phpmyadmin
  - layouts.navigation.team_leader
  - users.edit.member_of
  - users.errors.must_not_be_present
  - users.errors.senior_has_delegate
  - users.errors.avatar_requires_wca_id
  - registrations.errors.comp_not_found
  - registrations.errors.cannot_be_deleted_and_accepted
  - registrations.registration_info_people.newcomer.zero
  - registrations.registration_info_people.returner.zero
  - registrations.registration_info_people.person.zero
  - registrations.flash.accepted_and_mailed.zero
  - registrations.flash.accepted_and_mailed.one
  - registrations.flash.accepted_and_mailed.other
  - registrations.flash.rejected_and_mailed.zero
  - registrations.flash.rejected_and_mailed.one
  - registrations.flash.rejected_and_mailed.other
  - registrations.flash.deleted_and_mailed.zero
  - registrations.flash.deleted_and_mailed.one
  - registrations.flash.deleted_and_mailed.other
  - registrations.flash.single_deletion_and_mail
  - registrations.flash.cannot_delete
  - registrations.flash.failed
  - registrations.flash.deleted
  - registrations.mailer.pending.mail_subject
  - registrations.mailer.pending.moved_html
  - registrations.mailer.pending.causes.accepted
  - registrations.mailer.pending.causes.withdrawal
  - registrations.mailer.pending.email_if_error
  - registrations.list.country_plural.zero
  - registrations.list.approve
  - registrations.list.reject
  - registrations.list.delete
  - registrations.list.waiting_list
  - registrations.list.approved_registrations
  - registrations.list.deleted_registrations
  - registrations.list.n_events
  - registrations.panel_title
  - registrations.delete
  - registrations.greeting
  - registrations.can_register
  - registrations.check_registration_information
  - registrations.have_registered
  - registrations.contact_organizer
  - registrations.waiting_list
  - registrations.accepted
  - registrations.payment_form.title
  - registrations.payment_form.hints.card_information
  - registrations.payment_form.labels.entry_fees
  - registrations.payment_form.labels.card_information
  - registrations.payment_form.labels.cardholder_name
  - registrations.payment_form.alerts.not_a_number
  - registrations.payment_form.errors.intent_not_found
  - registrations.payment_form.errors.cardholder_name
  - registrations.payment_button_text
  - competitions.messages.stripe_connected
  - competitions.messages.stripe_not_connected
  - competitions.competition_info.organizer_plural.zero
  - competitions.competition_info.delegate.zero
  - competitions.competition_info.guests_free.anyone
  - competitions.index.disclaimer
  - competitions.index.no_access
  - competitions.index.no_comp_match
  - competitions.competition_form.name_reason_html
  - competitions.competition_form.venue_html
  - competitions.competition_form.venue_details_html
  - competitions.competition_form.contact_html
  - competitions.competition_form.labels.user_settings.receive_registration_emails
  - competitions.competition_form.labels.registration.allow_self_delete_after_acceptance
  - competitions.competition_form.labels.clone_tabs
  - competitions.competition_form.hints.venue.coordinates
  - competitions.competition_form.hints.championships
  - competitions.competition_form.hints.registration.allow_self_delete_after_acceptance
  - competitions.competition_form.hints.clone_tabs
  - competitions.competition_form.choices.registration.allow_self_delete_after_acceptance.true
  - competitions.competition_form.choices.registration.allow_self_delete_after_acceptance.false
  - competitions.errors.waiting_list_deadline_after_start
  - competitions.nearby_competitions.competitions.zero
  - competitions.nearby_competitions.competitions.one
  - competitions.nearby_competitions.competitions.other
  - competitions.nearby_competitions.label
  - competitions.nearby_competitions.label_admin
  - competitions.nearby_competitions.nearby_admin
  - competitions.nearby_competitions.no_comp_nearby
  - competitions.nearby_competitions.no_date_yet
  - competitions.nearby_competitions.no_location_yet
  - competitions.nearby_competitions.within
  - competitions.nearby_competitions.show
  - competitions.nearby_competitions.name
  - competitions.nearby_competitions.delegates
  - competitions.nearby_competitions.date
  - competitions.nearby_competitions.location
  - competitions.nearby_competitions.distance
  - competitions.nearby_competitions.before
  - competitions.nearby_competitions.after
  - competitions.nearby_competitions.starts_on
  - competitions.nearby_competitions.ends_on
  - competitions.nearby_competitions.limit
  - competitions.nearby_competitions.competitors
  - competitions.nearby_competitions.events
  - competitions.nearby_competitions.show_events
  - competitions.nearby_competitions.hide_events
  - competitions.upload_results
  - competitions.schedule.display_as.label
  - competitions.schedule.room
  - competitions.schedule.venue_information_html
  - competitions.schedule.multiple_venues_available
  - results.table_elements.citizen_of
  - contacts.website.faq_note_html
  - contacts.website.title
  - contacts.website.inquiries.competition
  - contacts.website.inquiries.competitions_in_general
  - contacts.website.inquiries.wca_id_or_profile
  - contacts.website.inquiries.media
  - contacts.website.inquiries.software
  - contacts.website.inquiries.different
  - contacts.website.submit
  - about.structure.teams_committees_councils
  - about.structure.board.name
  - about.structure.committees
  - about.structure.senior_member
  - about.structure.leader
  - about.structure.officers.name
  - about.structure.officers.description
  - about.structure.chair.name
  - about.structure.executive_director.name
  - about.structure.secretary.name
  - about.structure.vice_chair.name
  - about.structure.treasurer.name
  - about.structure.wdc.name
  - about.structure.wdc.description
  - about.structure.wdpc.name
  - about.structure.wdpc.description
  - about.structure.wrc.name
  - about.structure.wrc.description
  - about.structure.wrt.name
  - about.structure.wrt.description
  - about.structure.wst.name
  - about.structure.wst.description
  - about.structure.wct.name
  - about.structure.wct.description
  - about.structure.wfc.name
  - about.structure.wfc.description
  - about.structure.wec.name
  - about.structure.wec.description
  - about.structure.weat.name
  - about.structure.weat.description
  - about.structure.wqac.name
  - about.structure.wqac.description
  - about.structure.wcat.name
  - about.structure.wcat.description
  - about.structure.wmt.name
  - about.structure.wmt.description
  - about.structure.wac.name
  - about.structure.wac.description
  - about.structure.wsot.name
  - about.structure.wsot.description
  - about.structure.wat.name
  - about.structure.wat.description
  - about.structure.banned.name
  - about.structure.wct_china.name
  - about.structure.probation.name
  - about.structure.wst_admin.name
  - contact.title
  - contact.info_alert_html
  - contact.questions_and_comments_html
  - contact.committees_and_teams_html
  - contact.members
  - contact.board.info_html
  - contact.wct.info_html
  - contact.wdc.info_html
  - contact.wdpc.info_html
  - contact.wrc.info_html
  - contact.wrt.info_html
  - contact.wst.info_html
  - contact.wfc.info_html
  - contact.wec.info_html
  - contact.weat.info_html
  - contact.wqac.info_html
  - contact.wcat.info_html
  - contact.wmt.info_html
  - contact.wsot.info_html
  - contact.wat.info_html
  - contact.councils_html
  - contact.wac.info_html
  - logo.paragraphs.1
  - logo.paragraphs.2
  - logo.paragraphs.3
  - logo.paragraphs.4
  - relations.relation.title
  - relations.relation.find_relations
  - relations.relation.show_relation
  - relations.relation.connectives.was_at
  - relations.relation.connectives.together_with
  - relations.messages.invalid_wca_ids

I spot-checked a few of these, and those keys really don't exist in en.yml.

We should add a test that checks that the other locales contain no superfluous translations, and remove all of the existing ones in the same PR. The script above can easily be adapted to do this automatically.
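As a rough idea of what such a check could look like, here is a minimal sketch that runs the same comparison over every locale file in a directory and returns a report instead of printing. The helper names (`flatten_keys`, `extra_keys`, `extra_keys_by_locale`) are made up for illustration, and it assumes that each file has a single top-level locale key and that all locale files sit next to en.yml in one directory.

```ruby
require 'yaml'
require 'set'

# Flatten a nested hash into a set of dotted keys, e.g. "common.days.one".
def flatten_keys(hash, prefix = nil)
  hash.each_with_object(Set.new) do |(key, value), keys|
    full_key = [prefix, key].compact.join('.')
    if value.is_a?(Hash)
      keys.merge(flatten_keys(value, full_key))
    else
      keys << full_key
    end
  end
end

# Keys present in the target locale data but missing from the source.
# Both arguments are parsed YAML hashes with one top-level locale key
# (an assumption about the file layout).
def extra_keys(source_data, target_data)
  flatten_keys(target_data.values.first || {}) -
    flatten_keys(source_data.values.first || {})
end

# Map each locale file in `dir` (except the source file) to its sorted
# extra keys. Files without extra keys are left out of the report.
def extra_keys_by_locale(dir, source_name = 'en.yml')
  source_data = YAML.load_file(File.join(dir, source_name)) || {}
  Dir.glob(File.join(dir, '*.yml')).sort.each_with_object({}) do |path, report|
    next if File.basename(path) == source_name

    extras = extra_keys(source_data, YAML.load_file(path) || {})
    report[File.basename(path)] = extras.to_a.sort unless extras.empty?
  end
end
```

A test could then assert that `extra_keys_by_locale('config/locales')` is empty. The path here is the usual Rails default and is an assumption, not something checked against the repo.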

FinnIckler, Apr 01 '25 14:04

I am quite sure a few of those keys are legitimate, namely the one, many, other etc. plural keys. Many languages have different pluralization rules than English, so it is normal and expected that the YAML files are not 100% perfectly congruent all the time.

With that being said, there's a hefty share of keys in there that are indeed orphaned. CC @jonatanklosko for Internationalize (and other general wisdom 🧙‍♂️)

gregorbg, Apr 01 '25 14:04

I believe the issue is simply that the given translation file hasn't been updated for a while. To verify this, I created a new translation in Internationalize, pointing to the latest en.yml and ca.yml, then I exported the ca.yml translation without making any changes. After that, the outdated keys are gone and the script returns the following:

Extra translations found in ca.yml:
  - cutoff.time.zero
  - cutoff.moves.zero
  - cutoff.points.zero
  - common.days.zero
  - wca.errors.messages.form_error.zero
  - registrations.registration_info_people.newcomer.zero
  - registrations.registration_info_people.returner.zero
  - registrations.registration_info_people.person.zero
  - registrations.list.country_plural
  - competitions.competition_info.organizer_plural.zero
  - competitions.competition_info.delegate.zero

As @gregorbg mentioned, those are specific to different pluralization rules.
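If the script were kept as a check, one way to stop it from flagging these would be to ignore an extra key whenever it is only an additional plural form of an entry that en.yml also pluralizes. Below is a minimal sketch of that filter. `orphaned_keys` is a hypothetical helper that works on sets of dotted keys like the ones the script collects, and the category list is the standard CLDR one (zero, one, two, few, many, other).

```ruby
require 'set'

# CLDR plural categories; a given language uses some subset of these.
PLURAL_CATEGORIES = %w[zero one two few many other].to_set.freeze

# Given sets of dotted keys, return the target keys that are missing from
# the source, skipping extra plural forms of entries the source pluralizes.
def orphaned_keys(source_keys, target_keys)
  # Parents the source pluralizes, e.g. "cutoff.time" for "cutoff.time.one".
  plural_parents = source_keys.each_with_object(Set.new) do |key, parents|
    parent, _, leaf = key.rpartition('.')
    parents << parent if PLURAL_CATEGORIES.include?(leaf)
  end

  (target_keys - source_keys).reject do |key|
    parent, _, leaf = key.rpartition('.')
    PLURAL_CATEGORIES.include?(leaf) && plural_parents.include?(parent)
  end.to_set
end
```

With this, `cutoff.time.zero` is skipped as long as en.yml has `cutoff.time.one` or `cutoff.time.other`, while something like `common.these_events.one` is still reported if en.yml has no `common.these_events` entry at all. An entry that is a plain string in en.yml but pluralized in the translation would still be flagged, so this is a heuristic rather than a complete rule.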

jonatanklosko, Apr 02 '25 12:04

FTR, the translations status page also shows the unused keys for each locale.

jonatanklosko, Apr 02 '25 13:04