Audit and consolidate gettext/translation strings for similar translations

Open tofumatt opened this issue 2 years ago • 3 comments

We have a lot of small translation strings like “Get help”, “OK, got it!”, “Learn more” etc. throughout the codebase, but some of them have translation strings that are off-by-case, resulting in duplicate translation strings.

We should audit all existing strings, especially short strings, for duplicates/similarities and consolidate them. For instance, the "Ok, got it!" string has "duplicate" strings: https://github.com/google/site-kit-wp/pull/5640#discussion_r937935938

@aaemnnosttv pointed out we can use WP-CLI to collect all of the strings, then identify duplicates and fix them up. This issue should cover that 🙂


Do not alter or remove anything below. The following sections will be managed by moderators only.

Acceptance criteria

  • There should be no similar-but-duplicated gettext strings where the only difference is punctuation or casing, e.g. "Get help", "Get Help", and "Get help!" should all be consolidated into one string across the plugin.

Implementation Brief

Use the WP CLI to generate a list of the translatable strings used in Site Kit

  • Use the command wp i18n make-pot to generate a POT file containing the strings and their locations. See https://developer.wordpress.org/cli/commands/i18n/make-pot/
  • In testing while writing the IB, generating the POT was initially very slow. It's possible to speed it up dramatically by only enabling the PHP extensions needed for the command as described here: https://github.com/wp-cli/i18n-command/issues/80#issuecomment-572451958
  • The following invocation was found to work well and quickly generate the POT:
# First enter the Docker shell containing the WP CLI:
10updocker shell

# Change to the plugin directory:
cd wp-content/plugins/google-site-kit

# Then invoke as follows, generating the POT in languages/google-site-kit.pot:
php -n -dextension=tokenizer.so -dextension=gettext.so -dextension=phar.so -dextension=json.so -dextension=mbstring.so \
  $(which wp) i18n make-pot . languages/google-site-kit.pot --slug=google-site-kit \
  --exclude="node_modules,third-party,vendor,plugins,dist"

Find similar strings in the POT and harmonise them in Site Kit source

  • Write a script to extract the translatable strings from the POT.
  • Transform each string by lowercasing, removing all punctuation, and replacing any sequence of one or more spaces with a single space.
  • Identify any duplicates of the transformed strings.
  • Locate the duplicates within the POT and cross reference their location within Site Kit source code.
  • For each duplicate, examine the usage in Site Kit to identify whether the duplicate is a case of inconsistent usage of what should be the same string.
    • If so, replace the version(s) of the string that are incorrect with the correct version of the string. If there is ambiguity as to which version of the string to use, take a look at other strings within Site Kit to gauge which is most in keeping with the rest of the product.
    • For legitimate variations of the string, leave things as they are.
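
The extraction, transformation and duplicate-detection steps above could be sketched roughly as follows. This is a minimal illustration, not the script used for this issue: the names parse_pot, normalise, find_similar and SAMPLE_POT, and the file paths in the sample, are hypothetical. It also assumes that underscores count as punctuation and that any run of whitespace (not only spaces) should collapse to a single space.

```python
import re
from collections import defaultdict

# Hypothetical sample standing in for languages/google-site-kit.pot.
SAMPLE_POT = '''\
msgid ""
msgstr ""
"Project-Id-Version: Example\\n"

#: assets/js/a.js:10
#: assets/js/b.js:22
msgid "OK, Got it!"
msgstr ""

#: assets/js/c.js:5
msgid "OK, got it!"
msgstr ""

#: assets/js/d.js:7
msgid "Learn more"
msgstr ""
'''

_ESCAPES = {"n": "\n", "t": "\t", '"': '"', "\\": "\\"}


def _unquote(quoted):
    """Strip the surrounding quotes from a PO string and resolve escapes."""
    body = quoted.strip()[1:-1]
    return re.sub(r"\\(.)", lambda m: _ESCAPES.get(m.group(1), m.group(1)), body)


def parse_pot(text):
    """Return a list of (msgid, [locations]) pairs, skipping the header entry."""
    entries = []
    locations, msgid, in_msgid = [], None, False
    # The extra empty line flushes the final entry.
    for line in text.splitlines() + [""]:
        line = line.strip()
        if line.startswith("#:"):
            locations.extend(line[2:].split())
            in_msgid = False
        elif line.startswith("msgid "):
            msgid = _unquote(line[len("msgid "):])
            in_msgid = True
        elif line.startswith('"') and in_msgid:
            msgid += _unquote(line)  # continuation of a multi-line msgid
        elif not line:
            if msgid:  # the header entry has an empty msgid
                entries.append((msgid, locations))
            locations, msgid, in_msgid = [], None, False
        else:
            in_msgid = False  # msgstr, msgid_plural, msgctxt, other comments
    return entries


def normalise(s):
    """Lowercase, drop punctuation, and collapse whitespace runs."""
    s = re.sub(r"[^\w\s]|_", "", s.lower())
    return re.sub(r"\s+", " ", s).strip()


def find_similar(entries):
    """Map each normalised string to its variants when more than one exists."""
    groups = defaultdict(dict)
    for msgid, locations in entries:
        groups[normalise(msgid)].setdefault(msgid, []).extend(locations)
    return {key: variants for key, variants in groups.items() if len(variants) > 1}


if __name__ == "__main__":
    for key, variants in find_similar(parse_pot(SAMPLE_POT)).items():
        print(key)
        for msgid, locations in variants.items():
            print(f"  {msgid!r}: {', '.join(locations)}")
```

Each group in the output lists the original variants with their source locations, which gives a starting point for the manual review described in the remaining steps.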

Appropriate judgement should be applied when determining whether a string is a candidate for fixing.

For example, searching for the transformed string /ok.*got.*it/ returns:

  • Four instances of the string OK, Got it!.
  • One instance of the string OK, got it!.
  • All of these are dismiss button labels.

Clearly the intention is for these to be the same string and OK, got it! should be replaced with OK, Got it!.

Whereas, searching for /the.*code.*is.*controlled.*by.*the.*tag.*manager.*module/ returns:

  • Two instances of the string The code is controlled by the Tag Manager module. in UseSnippetSwitch variants used in the context of the Setup view.
  • One instance of the string The code is controlled by the Tag Manager module in the context of a Settings view.

Looking at these strings in the UI, it's clear that the full stop version is correct in the Setup view, while the version without the full stop is correct in the Settings view. In this case, the strings are fine as they are and no change needs to be made.
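
As a rough sketch of the two searches above, the following matches a pattern against the transformed form of each string and returns the original variants for review. The helper names normalise and search and the MSGIDS list are hypothetical; the strings are the ones quoted in this brief.

```python
import re

# Strings quoted in this brief, plus one unrelated string for contrast.
MSGIDS = [
    "OK, Got it!",
    "OK, got it!",
    "The code is controlled by the Tag Manager module.",
    "The code is controlled by the Tag Manager module",
    "Learn more",
]


def normalise(s):
    """Lowercase, drop punctuation, and collapse whitespace runs."""
    s = re.sub(r"[^\w\s]|_", "", s.lower())
    return re.sub(r"\s+", " ", s).strip()


def search(msgids, pattern):
    """Return the original strings whose transformed form matches pattern."""
    regex = re.compile(pattern)
    return [m for m in msgids if regex.search(normalise(m))]


if __name__ == "__main__":
    for variant in search(MSGIDS, r"ok.*got.*it"):
        print(repr(variant))
    for variant in search(MSGIDS, r"the.*code.*is.*controlled.*by.*the.*tag.*manager.*module"):
        print(repr(variant))
```

The search only surfaces candidates. Whether a pair is an inconsistency (as with the dismiss button label) or a legitimate variation (as with the Tag Manager sentence) still has to be decided by looking at the usage in the UI.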

Setup view - full stops are consistent
image

Settings view - no full stop is fine
image

Test Coverage

  • Update any failing tests.

QA Brief

  • Changed gettext strings should appear as "OK, Got it!" (note the capitalisation) when completing GA4 setup and when completing the user input survey.
  • Start Idea Hub setup but cancel mid-way through, and ensure that "Complete setup", not "Complete set up", appears as the CTA on the Site Kit dashboard.

Changelog entry

tofumatt avatar Aug 04 '22 19:08 tofumatt

@tofumatt @techanvil how are we supposed to deal with duplicates? Let's say we have Get Help and Get help strings; then we can use the Get help text and a text-transform: capitalize style for the Get Help case. But what about punctuation? We can't remove it from translatable strings because punctuation follows different rules in different languages, so we have to leave it in the translatable strings, right? Let's add more detailed instructions on which approaches we should use.

eugene-manuilov avatar Aug 09 '22 12:08 eugene-manuilov

Hi @eugene-manuilov, thanks for pointing that out - definitely a good shout to expand on this.

My take on this is that the key objective is to avoid inconsistent usage of strings as they are shown to the user. So I don't see a problem with very similar strings that differ in casing/punctuation, as long as they are actually correct at the point of usage.

I've updated the IB with some more detail and guidance on this, please see what you think.

techanvil avatar Aug 09 '22 15:08 techanvil

Thanks, @techanvil. IB ✔️

eugene-manuilov avatar Aug 09 '22 15:08 eugene-manuilov

QA Update ⚠️

Verified this for the user input survey and the Idea Hub module, and checked other modules for consistency. But I'm getting the GA4 activation banner on the develop branch. I don't have access to the elasticpress.io site. According to Darren, he is getting the banner using the developer plugin and the elasticpress.io site.

image

User survey:

image

Idea Hub:

image

Analytics:

image

cc @wpdarren assigning to you so that you can verify this using the elasticpress.io site.

mohitwp avatar Sep 05 '22 02:09 mohitwp

QA Update: ✅

Verified:

  • When completing GA4 setup, the CTA button states "OK, Got it!"

  • When completing user input survey, the CTA button states "OK, Got it!"

  • When setting up Idea Hub and cancelling mid-way through, the CTA states "Complete setup"

Screenshots

image

image

image

wpdarren avatar Sep 05 '22 02:09 wpdarren