Locale YAMLs have hundreds of deleted keys
While working on the database translation PRs, I noticed that many older locale files still contain loads of keys that have already been deleted from en.yml.
This generated a ton of unnecessary work for me, and it would create even more if a translator decided to base a new translation on an outdated YAML file like this.
I had ChatGPT generate this script to compare a YAML file against en.yml:
require 'yaml'
require 'set'

# Load YAML files
def load_yaml(file)
  YAML.load_file(file) || {}
end

# Recursively collect all translation keys
def collect_keys(hash, prefix = '')
  keys = {}
  hash.each do |key, value|
    full_key = prefix.empty? ? key.to_s : "#{prefix}.#{key}"
    if value.is_a?(Hash)
      keys.merge!(collect_keys(value, full_key))
    else
      keys[full_key] = value
    end
  end
  keys
end

# Compare translation keys
def find_extra_keys(source_file, target_file)
  source_data = load_yaml(source_file)
  target_data = load_yaml(target_file)
  source_keys = collect_keys(source_data.values.first).keys.to_set
  target_keys = collect_keys(target_data.values.first).keys.to_set
  extra_keys = target_keys - source_keys
  if extra_keys.empty?
    puts "No extra translations found in #{target_file}."
  else
    puts "Extra translations found in #{target_file}:"
    extra_keys.each { |key| puts " - #{key}" }
  end
end

# Example usage
if ARGV.length < 2
  puts "Usage: ruby script.rb en.yml ca.yml"
  exit 1
end
find_extra_keys(ARGV[0], ARGV[1])
Running it against ca.yml, for example, yields:
Extra translations found in ca.yml:
- cutoff.time.zero
- cutoff.moves.zero
- cutoff.points.zero
- common.user.citizen_of
- common.days.zero
- common.these_events.zero
- common.these_events.one
- common.these_events.other
- activerecord.attributes.competition.id
- activerecord.attributes.competition.confirmed
- activerecord.attributes.competition.showAtAll
- activerecord.attributes.competition.name
- activerecord.attributes.competition.name_reason
- activerecord.attributes.competition.cellName
- activerecord.attributes.competition.countryId
- activerecord.attributes.competition.cityName
- activerecord.attributes.competition.venue
- activerecord.attributes.competition.venueAddress
- activerecord.attributes.competition.venueDetails
- activerecord.attributes.competition.start_date
- activerecord.attributes.competition.end_date
- activerecord.attributes.competition.external_website
- activerecord.attributes.competition.generate_website
- activerecord.attributes.competition.base_entry_fee_lowest_denomination
- activerecord.attributes.competition.currency_code
- activerecord.attributes.competition.delegate_ids
- activerecord.attributes.competition.trainee_delegate_ids
- activerecord.attributes.competition.organizer_ids
- activerecord.attributes.competition.contact
- activerecord.attributes.competition.information
- activerecord.attributes.competition.use_wca_registration
- activerecord.attributes.competition.external_registration_page
- activerecord.attributes.competition.competitor_limit_enabled
- activerecord.attributes.competition.competitor_limit
- activerecord.attributes.competition.competitor_limit_reason
- activerecord.attributes.competition.guests_enabled
- activerecord.attributes.competition.receive_registration_emails
- activerecord.attributes.competition.registration_open
- activerecord.attributes.competition.registration_close
- activerecord.attributes.competition.remarks
- activerecord.attributes.competition.clone_tabs
- activerecord.attributes.competition.competition_events
- activerecord.attributes.competition.enable_donations
- activerecord.attributes.competition.championship_type
- activerecord.attributes.competition.extra_registration_requirements
- activerecord.attributes.competition.on_the_spot_registration
- activerecord.attributes.competition.on_the_spot_entry_fee_lowest_denomination
- activerecord.attributes.competition.allow_registration_edits
- activerecord.attributes.competition.allow_registration_self_delete_after_acceptance
- activerecord.attributes.competition.refund_policy_percent
- activerecord.attributes.competition.refund_policy_limit_date
- activerecord.attributes.competition.waiting_list_deadline_date
- activerecord.attributes.competition.event_change_deadline_date
- activerecord.attributes.competition.guests_entry_fee_lowest_denomination
- activerecord.attributes.competition.free_guest_entry_status
- activerecord.attributes.competition.early_puzzle_submission
- activerecord.attributes.competition.early_puzzle_submission_reason
- activerecord.attributes.competition.qualification_results
- activerecord.attributes.competition.qualification_results_reason
- activerecord.attributes.competition.event_restrictions
- activerecord.attributes.competition.event_restrictions_reason
- activerecord.attributes.competition.main_event_id
- activerecord.attributes.delegate_report.wdc_feedback_requested
- activerecord.attributes.delegate_report.wdc_incidents
- enums.user.delegate_status.trainee_delegate
- enums.user.delegate_status.candidate_delegate
- enums.user.delegate_status.delegate
- enums.user.delegate_status.senior_delegate
- enums.person.gender.m
- enums.person.gender.f
- enums.person.gender.o
- enums.competition.free_guest_entry_status.unclear
- enums.competition.free_guest_entry_status.anyone
- enums.competition.free_guest_entry_status.restricted
- simple_form.hints.user.delegate_status
- simple_form.hints.user.region
- simple_form.hints.competition.name
- simple_form.hints.competition.cellName
- simple_form.hints.competition.cityName
- simple_form.hints.competition.information
- simple_form.hints.competition.delegate_ids
- simple_form.hints.competition.trainee_delegate_ids
- simple_form.hints.competition.organizer_ids
- simple_form.hints.competition.external_website
- simple_form.hints.competition.registration_open
- simple_form.hints.competition.remarks
- simple_form.hints.competition.clone_tabs
- simple_form.hints.competition.countryId
- simple_form.hints.competition.end_date
- simple_form.hints.competition.generate_website
- simple_form.hints.competition.external_registration_page
- simple_form.hints.competition.base_entry_fee_lowest_denomination
- simple_form.hints.competition.currency_code
- simple_form.hints.competition.guests_enabled
- simple_form.hints.competition.enable_donations
- simple_form.hints.competition.id
- simple_form.hints.competition.confirmed
- simple_form.hints.competition.receive_registration_emails
- simple_form.hints.competition.registration_close
- simple_form.hints.competition.showAtAll
- simple_form.hints.competition.start_date
- simple_form.hints.competition.use_wca_registration
- simple_form.hints.competition.competitor_limit_enabled
- simple_form.hints.competition.competitor_limit
- simple_form.hints.competition.competitor_limit_reason
- simple_form.hints.competition.venueAddress
- simple_form.hints.competition.championship_type
- simple_form.hints.competition.extra_registration_requirements
- simple_form.hints.competition.on_the_spot_registration
- simple_form.hints.competition.on_the_spot_entry_fee_lowest_denomination
- simple_form.hints.competition.allow_registration_edits
- simple_form.hints.competition.allow_registration_self_delete_after_acceptance
- simple_form.hints.competition.refund_policy_percent
- simple_form.hints.competition.refund_policy_limit_date
- simple_form.hints.competition.waiting_list_deadline_date
- simple_form.hints.competition.event_change_deadline_date
- simple_form.hints.competition.guests_entry_fee_lowest_denomination
- simple_form.hints.competition.free_guest_entry_status
- simple_form.hints.competition.early_puzzle_submission
- simple_form.hints.competition.early_puzzle_submission_reason
- simple_form.hints.competition.qualification_results
- simple_form.hints.competition.qualification_results_reason
- simple_form.hints.competition.event_restrictions
- simple_form.hints.competition.event_restrictions_reason
- simple_form.hints.competition.main_event_id
- simple_form.hints.website_contact.inquiry
- simple_form.hints.website_contact.competition_id
- simple_form.hints.website_contact.message
- simple_form.hints.website_contact.name
- simple_form.hints.website_contact.your_email
- simple_form.hints.post.sticky
- simple_form.hints.post.unstick_at
- simple_form.hints.post.title
- simple_form.hints.post.tags
- simple_form.hints.delegate_report.wdc_feedback_requested
- simple_form.hints.delegate_report.wdc_incidents
- simple_form.hints.person.dob
- simple_form.hints.person.countryId
- simple_form.hints.person.gender
- simple_form.hints.person.name
- simple_form.hints.person.wca_id
- simple_form.hints.person.incorrect_wca_id_claim_count
- simple_form.hints.person.person_wca_id
- simple_form.hints.registration.comments
- simple_form.hints.registration.guests
- simple_form.hints.registration.status
- simple_form.hints.wfc.from_date
- simple_form.hints.wfc.to_date
- wca.errors.messages.form_error.zero
- wca.devise.welcome
- layouts.navigation.db_export
- layouts.navigation.results_admin
- layouts.navigation.results_phpmyadmin
- layouts.navigation.team_leader
- users.edit.member_of
- users.errors.must_not_be_present
- users.errors.senior_has_delegate
- users.errors.avatar_requires_wca_id
- registrations.errors.comp_not_found
- registrations.errors.cannot_be_deleted_and_accepted
- registrations.registration_info_people.newcomer.zero
- registrations.registration_info_people.returner.zero
- registrations.registration_info_people.person.zero
- registrations.flash.accepted_and_mailed.zero
- registrations.flash.accepted_and_mailed.one
- registrations.flash.accepted_and_mailed.other
- registrations.flash.rejected_and_mailed.zero
- registrations.flash.rejected_and_mailed.one
- registrations.flash.rejected_and_mailed.other
- registrations.flash.deleted_and_mailed.zero
- registrations.flash.deleted_and_mailed.one
- registrations.flash.deleted_and_mailed.other
- registrations.flash.single_deletion_and_mail
- registrations.flash.cannot_delete
- registrations.flash.failed
- registrations.flash.deleted
- registrations.mailer.pending.mail_subject
- registrations.mailer.pending.moved_html
- registrations.mailer.pending.causes.accepted
- registrations.mailer.pending.causes.withdrawal
- registrations.mailer.pending.email_if_error
- registrations.list.country_plural.zero
- registrations.list.approve
- registrations.list.reject
- registrations.list.delete
- registrations.list.waiting_list
- registrations.list.approved_registrations
- registrations.list.deleted_registrations
- registrations.list.n_events
- registrations.panel_title
- registrations.delete
- registrations.greeting
- registrations.can_register
- registrations.check_registration_information
- registrations.have_registered
- registrations.contact_organizer
- registrations.waiting_list
- registrations.accepted
- registrations.payment_form.title
- registrations.payment_form.hints.card_information
- registrations.payment_form.labels.entry_fees
- registrations.payment_form.labels.card_information
- registrations.payment_form.labels.cardholder_name
- registrations.payment_form.alerts.not_a_number
- registrations.payment_form.errors.intent_not_found
- registrations.payment_form.errors.cardholder_name
- registrations.payment_button_text
- competitions.messages.stripe_connected
- competitions.messages.stripe_not_connected
- competitions.competition_info.organizer_plural.zero
- competitions.competition_info.delegate.zero
- competitions.competition_info.guests_free.anyone
- competitions.index.disclaimer
- competitions.index.no_access
- competitions.index.no_comp_match
- competitions.competition_form.name_reason_html
- competitions.competition_form.venue_html
- competitions.competition_form.venue_details_html
- competitions.competition_form.contact_html
- competitions.competition_form.labels.user_settings.receive_registration_emails
- competitions.competition_form.labels.registration.allow_self_delete_after_acceptance
- competitions.competition_form.labels.clone_tabs
- competitions.competition_form.hints.venue.coordinates
- competitions.competition_form.hints.championships
- competitions.competition_form.hints.registration.allow_self_delete_after_acceptance
- competitions.competition_form.hints.clone_tabs
- competitions.competition_form.choices.registration.allow_self_delete_after_acceptance.true
- competitions.competition_form.choices.registration.allow_self_delete_after_acceptance.false
- competitions.errors.waiting_list_deadline_after_start
- competitions.nearby_competitions.competitions.zero
- competitions.nearby_competitions.competitions.one
- competitions.nearby_competitions.competitions.other
- competitions.nearby_competitions.label
- competitions.nearby_competitions.label_admin
- competitions.nearby_competitions.nearby_admin
- competitions.nearby_competitions.no_comp_nearby
- competitions.nearby_competitions.no_date_yet
- competitions.nearby_competitions.no_location_yet
- competitions.nearby_competitions.within
- competitions.nearby_competitions.show
- competitions.nearby_competitions.name
- competitions.nearby_competitions.delegates
- competitions.nearby_competitions.date
- competitions.nearby_competitions.location
- competitions.nearby_competitions.distance
- competitions.nearby_competitions.before
- competitions.nearby_competitions.after
- competitions.nearby_competitions.starts_on
- competitions.nearby_competitions.ends_on
- competitions.nearby_competitions.limit
- competitions.nearby_competitions.competitors
- competitions.nearby_competitions.events
- competitions.nearby_competitions.show_events
- competitions.nearby_competitions.hide_events
- competitions.upload_results
- competitions.schedule.display_as.label
- competitions.schedule.room
- competitions.schedule.venue_information_html
- competitions.schedule.multiple_venues_available
- results.table_elements.citizen_of
- contacts.website.faq_note_html
- contacts.website.title
- contacts.website.inquiries.competition
- contacts.website.inquiries.competitions_in_general
- contacts.website.inquiries.wca_id_or_profile
- contacts.website.inquiries.media
- contacts.website.inquiries.software
- contacts.website.inquiries.different
- contacts.website.submit
- about.structure.teams_committees_councils
- about.structure.board.name
- about.structure.committees
- about.structure.senior_member
- about.structure.leader
- about.structure.officers.name
- about.structure.officers.description
- about.structure.chair.name
- about.structure.executive_director.name
- about.structure.secretary.name
- about.structure.vice_chair.name
- about.structure.treasurer.name
- about.structure.wdc.name
- about.structure.wdc.description
- about.structure.wdpc.name
- about.structure.wdpc.description
- about.structure.wrc.name
- about.structure.wrc.description
- about.structure.wrt.name
- about.structure.wrt.description
- about.structure.wst.name
- about.structure.wst.description
- about.structure.wct.name
- about.structure.wct.description
- about.structure.wfc.name
- about.structure.wfc.description
- about.structure.wec.name
- about.structure.wec.description
- about.structure.weat.name
- about.structure.weat.description
- about.structure.wqac.name
- about.structure.wqac.description
- about.structure.wcat.name
- about.structure.wcat.description
- about.structure.wmt.name
- about.structure.wmt.description
- about.structure.wac.name
- about.structure.wac.description
- about.structure.wsot.name
- about.structure.wsot.description
- about.structure.wat.name
- about.structure.wat.description
- about.structure.banned.name
- about.structure.wct_china.name
- about.structure.probation.name
- about.structure.wst_admin.name
- contact.title
- contact.info_alert_html
- contact.questions_and_comments_html
- contact.committees_and_teams_html
- contact.members
- contact.board.info_html
- contact.wct.info_html
- contact.wdc.info_html
- contact.wdpc.info_html
- contact.wrc.info_html
- contact.wrt.info_html
- contact.wst.info_html
- contact.wfc.info_html
- contact.wec.info_html
- contact.weat.info_html
- contact.wqac.info_html
- contact.wcat.info_html
- contact.wmt.info_html
- contact.wsot.info_html
- contact.wat.info_html
- contact.councils_html
- contact.wac.info_html
- logo.paragraphs.1
- logo.paragraphs.2
- logo.paragraphs.3
- logo.paragraphs.4
- relations.relation.title
- relations.relation.find_relations
- relations.relation.show_relation
- relations.relation.connectives.was_at
- relations.relation.connectives.together_with
- relations.messages.invalid_wca_ids
I spot-checked a few of these keys, and they really don't exist in en.yml.
We should add a test that checks that there are no superfluous translations in the other locales, and remove all of the existing ones in the same PR. The above script could easily be adapted to do this automatically.
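A rough sketch of how the script could be turned into such a check. This is an illustration rather than a finished test: the helper names (`flatten_keys`, `superfluous_keys`, `check_locales`) and the idea of scanning a single directory for `*.yml` files are my assumptions, not something that exists in the repo.

```ruby
require 'yaml'

# Flatten a nested translation hash into a list of dotted keys.
def flatten_keys(hash, prefix = nil)
  hash.flat_map do |key, value|
    full_key = [prefix, key].compact.join('.')
    value.is_a?(Hash) ? flatten_keys(value, full_key) : [full_key]
  end
end

# Return the keys present in the target locale but absent from the source.
# Both arguments are parsed locale documents such as { 'en' => { ... } }.
def superfluous_keys(source_data, target_data)
  source_keys = flatten_keys(source_data.values.first || {})
  target_keys = flatten_keys(target_data.values.first || {})
  target_keys - source_keys
end

# Check every locale file in a directory against the base file (hypothetical layout).
# Returns { file name => [superfluous keys] }, listing offending files only.
def check_locales(dir, base: 'en.yml')
  source = YAML.load_file(File.join(dir, base)) || {}
  Dir.glob(File.join(dir, '*.yml')).sort.each_with_object({}) do |path, report|
    next if File.basename(path) == base

    extras = superfluous_keys(source, YAML.load_file(path) || {})
    report[File.basename(path)] = extras unless extras.empty?
  end
end
```

In a spec this could boil down to asserting that `check_locales` returns an empty hash for the locale directory (the path depends on the project layout). As written, it would still flag locale-specific plural keys, so those would need to be allowed separately.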
I am quite sure a few of those keys are legitimate, namely the plural keys (one, many, other, etc.). Many languages have different pluralization rules than English, so it is normal and expected that the YAML files are not 100% congruent all the time.
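To make that distinction concrete, here is a minimal sketch of how the reported keys could be split into "extra plural form of a key that en.yml also pluralizes" and "genuinely orphaned". It assumes the six CLDR plural category names (zero, one, two, few, many, other) as the allowed suffixes; `partition_extra_keys` is a hypothetical helper, and the heuristic is only as good as that assumption.

```ruby
# CLDR plural categories; a locale may legitimately define any of these
# for a key that English only pluralizes with `one` and `other`.
PLURAL_CATEGORIES = %w[zero one two few many other].freeze

# Split extra keys into [plural variants of a pluralized source key, orphans].
# Both arguments are arrays of dotted keys such as "common.days.zero".
def partition_extra_keys(source_keys, extra_keys)
  # Parents of source keys that end in a plural category, e.g. "common.days".
  plural_parents = source_keys.filter_map do |key|
    parent, _, leaf = key.rpartition('.')
    parent if PLURAL_CATEGORIES.include?(leaf)
  end.uniq

  extra_keys.partition do |key|
    parent, _, leaf = key.rpartition('.')
    PLURAL_CATEGORIES.include?(leaf) && plural_parents.include?(parent)
  end
end
```

With this rule, a key like `cutoff.time.zero` is accepted only if en.yml itself has some `cutoff.time.<category>` entry, while a block such as `common.these_events.*`, whose parent is gone from en.yml entirely, still counts as orphaned. A key reported without a plural suffix would also be treated as an orphan by this heuristic.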
With that being said, there's a hefty share of keys in there that are indeed orphaned. CC @jonatanklosko for Internationalize (and other general wisdom 🧙‍♂️)
I believe the issue is simply that the given translation file hasn't been updated for a while. To verify this, I created a new translation in Internationalize, pointing to the latest en.yml and ca.yml, then I exported the ca.yml translation without making any changes. After that, the outdated keys are gone and the script returns the following:
Extra translations found in ca.yml:
- cutoff.time.zero
- cutoff.moves.zero
- cutoff.points.zero
- common.days.zero
- wca.errors.messages.form_error.zero
- registrations.registration_info_people.newcomer.zero
- registrations.registration_info_people.returner.zero
- registrations.registration_info_people.person.zero
- registrations.list.country_plural
- competitions.competition_info.organizer_plural.zero
- competitions.competition_info.delegate.zero
As @gregorbg mentioned, those are specific to different pluralization rules.
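For illustration only, this is roughly what dropping the outdated keys while keeping locale-specific plural forms could look like if done by hand. It is not a description of how Internationalize works internally; `prune` and `plural_entry?` are hypothetical helpers, and treating any hash whose keys are all CLDR plural categories as a pluralization entry is an assumption.

```ruby
PLURAL_CATEGORIES = %w[zero one two few many other].freeze

# True if the value looks like a pluralization entry ({ 'one' => ..., 'other' => ... }).
def plural_entry?(value)
  value.is_a?(Hash) && !value.empty? &&
    value.keys.all? { |key| PLURAL_CATEGORIES.include?(key.to_s) }
end

# Return a copy of `target` containing only keys that also exist in `source`.
# Pluralization entries in the target are kept whole, so extra categories
# such as `zero` survive. Nested hashes may end up empty after pruning.
def prune(source, target)
  target.each_with_object({}) do |(key, value), kept|
    next unless source.key?(key)

    kept[key] =
      if value.is_a?(Hash) && source[key].is_a?(Hash) && !plural_entry?(value)
        prune(source[key], value)
      else
        value
      end
  end
end
```

Since the two files have different root keys, the call would be made on the contents below the locale code, e.g. `prune(en_data['en'], ca_data['ca'])`, with both variables holding the parsed YAML.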
FTR, the translations status page also shows the unused keys for each locale.